Taeyoon.Kim.DS


Hugging Face Course


Write your training loop in PyTorch

https://www.youtube.com/watch?v=Dh9CL8fyG80&list=PLo2EIpI_JMQvWfQndUesu0nPBAtZ9gP1o&index=30 from datasets import load_dataset from transformers import AutoTokenizer, DataCollatorWithPadding raw_datasets = load_dataset("glue", "mrpc") # the dataset checkpoint = "bert-base-cased" # model name - the case-sensitive BERT model. tokenizer = AutoTokenizer.from_pretrained(checkpoint) # tokenizer built from the pretrained model. ..

Hugging Face Course 2023. 9. 19. 20:49
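The excerpt above sets up the data, checkpoint, and tokenizer; a minimal sketch of how the full loop might continue from there (the tokenize function, dataloader settings, and epoch count are illustrative assumptions):

```python
import torch
from torch.utils.data import DataLoader
from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    AutoModelForSequenceClassification,
    DataCollatorWithPadding,
)

raw_datasets = load_dataset("glue", "mrpc")
checkpoint = "bert-base-cased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

def tokenize_fn(batch):
    # MRPC is a sentence-pair task; truncate to the model's max length.
    return tokenizer(batch["sentence1"], batch["sentence2"], truncation=True)

tokenized = raw_datasets.map(tokenize_fn, batched=True)
tokenized = tokenized.remove_columns(["sentence1", "sentence2", "idx"])
tokenized = tokenized.rename_column("label", "labels")
tokenized.set_format("torch")

# Dynamic padding per batch via the collator from the excerpt.
collator = DataCollatorWithPadding(tokenizer=tokenizer)
train_loader = DataLoader(tokenized["train"], batch_size=8, shuffle=True, collate_fn=collator)

model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

model.train()
for epoch in range(3):  # assumed epoch count
    for batch in train_loader:
        batch = {k: v.to(device) for k, v in batch.items()}
        loss = model(**batch).loss  # the head computes loss when labels are present
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
```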

TF predictions and metrics (F1 score, recall, precision etc)

https://www.youtube.com/watch?v=nx10eh4CoOs&list=PLo2EIpI_JMQvWfQndUesu0nPBAtZ9gP1o&index=29 # Run the tokenized validation split through the model and store the raw logits in preds. preds = model.predict(tokenized_datasets['validation'])['logits'] # Feed the logits through tf.nn.softmax so each row becomes class probabilities. probabilities = tf.nn.softmax(preds) # class_preds takes the argmax over those probabilities. ar..

Hugging Face Course 2023. 9. 19. 18:51
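Following the excerpt, a minimal sketch of computing precision, recall, and F1 from those class predictions; it assumes a trained Keras model, the excerpt's tokenized_datasets, and a label column named "label":

```python
import numpy as np
import tensorflow as tf
from sklearn.metrics import f1_score, precision_score, recall_score

# Tokenized validation split -> raw logits, as in the excerpt above.
preds = model.predict(tokenized_datasets["validation"])["logits"]
probabilities = tf.nn.softmax(preds)             # logits -> class probabilities
class_preds = np.argmax(probabilities, axis=-1)  # most likely class per example

labels = np.array(tokenized_datasets["validation"]["label"])  # assumed column name
print("precision:", precision_score(labels, class_preds))
print("recall:", recall_score(labels, class_preds))
print("f1:", f1_score(labels, class_preds))
```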

The Trainer API

Trainer needs: 1. Model, 2. Training/validation/test datasets, 3. Tokenizer, 4. Data collator, 5. Hyperparameters, 6. Metrics for evaluation/prediction. !pip install datasets transformers[sentencepiece] from datasets import load_dataset from transformers import AutoTokenizer, DataCollatorWithPadding raw_datasets = load_dataset("glue", "mrpc") checkpoint = "bert-base-cased" tokenizer = AutoTokenizer.from_pre..

Hugging Face Course 2023. 9. 15. 22:13
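A minimal sketch of how the six pieces listed above might plug into the Trainer; the TrainingArguments values are illustrative assumptions:

```python
from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    AutoModelForSequenceClassification,
    DataCollatorWithPadding,
    TrainingArguments,
    Trainer,
)

# 2. Datasets and 3. tokenizer, as in the excerpt.
raw_datasets = load_dataset("glue", "mrpc")
checkpoint = "bert-base-cased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

def tokenize_fn(batch):
    return tokenizer(batch["sentence1"], batch["sentence2"], truncation=True)

tokenized = raw_datasets.map(tokenize_fn, batched=True)

trainer = Trainer(
    model=AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2),  # 1. model
    args=TrainingArguments("mrpc-trainer", num_train_epochs=3),  # 5. hyperparameters (assumed values)
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    data_collator=DataCollatorWithPadding(tokenizer=tokenizer),  # 4. data collator
    tokenizer=tokenizer,
)
trainer.train()
```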

Transformers Pipeline, Tokenizer, Model, and Result

from PIL import Image import glob from transformers import BlipProcessor, BlipForConditionalGeneration processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base") model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

Hugging Face Course 2023. 9. 14. 23:16
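A minimal sketch of using the processor/model pair loaded in the excerpt to caption a single image; the file name is a hypothetical placeholder:

```python
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration

processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

image = Image.open("example.jpg").convert("RGB")        # hypothetical input image
inputs = processor(images=image, return_tensors="pt")   # image -> pixel_values
out = model.generate(**inputs, max_new_tokens=20)       # autoregressive caption tokens
print(processor.decode(out[0], skip_special_tokens=True))
```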

The tokenization pipeline

https://www.youtube.com/watch?v=Yffk5aydLzg&list=PLo2EIpI_JMQvWfQndUesu0nPBAtZ9gP1o&index=17 A tokenizer takes texts as inputs and outputs numbers the associated model can make sense of. from transformers import AutoTokenizer tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased") inputs = tokenizer("Let's try to tokenize!") print(inputs["input_ids"]) [101, 2292, 1005, 1055, 3046, 2000, 19..

Hugging Face Course 2023. 9. 14. 19:08
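A minimal sketch of the intermediate steps the post's title refers to, using the same bert-base-uncased tokenizer as the excerpt:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
text = "Let's try to tokenize!"

tokens = tokenizer.tokenize(text)              # text -> subword tokens
ids = tokenizer.convert_tokens_to_ids(tokens)  # tokens -> vocabulary ids
final = tokenizer.prepare_for_model(ids)       # adds the [CLS]/[SEP] special tokens

print(tokens)
print(final["input_ids"])  # same ids as calling tokenizer(text) directly
```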

What happens inside the pipeline function? (TensorFlow)

Raw text --> Input IDs --> Logits --> Predictions (softmax) return_tensors = "pt" or "tf" Instantiate a Transformers model (PyTorch) config file -> config class -> model config https://www.youtube.com/watch?v=AhChOFRegn4&list=PLo2EIpI_JMQvWfQndUesu0nPBAtZ9gP1o&index=11

Hugging Face Course 2023. 9. 13. 19:33
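A minimal sketch of that Raw text -> Input IDs -> Logits -> Predictions chain in TensorFlow; the sentiment-analysis checkpoint is an assumed example:

```python
import tensorflow as tf
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

checkpoint = "distilbert-base-uncased-finetuned-sst-2-english"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = TFAutoModelForSequenceClassification.from_pretrained(checkpoint)

inputs = tokenizer(["This course is amazing!"], return_tensors="tf")  # raw text -> input IDs
logits = model(**inputs).logits                                       # input IDs -> logits
predictions = tf.nn.softmax(logits, axis=-1)                          # logits -> probabilities
print(predictions)
```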

What happens inside the pipeline function? (PyTorch)

https://www.youtube.com/watch?v=1pedAIvTWXk&list=PLo2EIpI_JMQvWfQndUesu0nPBAtZ9gP1o&index=9 Tokenizer: Raw text ("This course is amazing!") becomes Input IDs [101, 2023, 2607, 2003, 6429, 999, 102], with special tokens added at the start and end of the sentence. Raw text -> Tokens -> Special tokens -> Input IDs. Model: The AutoModel class loads a model without its pretraining head.

Hugging Face Course 2023. 9. 13. 19:22
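A minimal sketch of the same chain in PyTorch, contrasting AutoModel (no pretraining head, hidden states out) with a headed model (logits out); the checkpoint is an assumed example:

```python
import torch
from transformers import AutoTokenizer, AutoModel, AutoModelForSequenceClassification

checkpoint = "distilbert-base-uncased-finetuned-sst-2-english"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

inputs = tokenizer(["This course is amazing!"], return_tensors="pt")
print(inputs["input_ids"])  # start/end special tokens included, as in the excerpt

base = AutoModel.from_pretrained(checkpoint)   # body only, no task head
print(base(**inputs).last_hidden_state.shape)  # (batch, seq_len, hidden_size)

clf = AutoModelForSequenceClassification.from_pretrained(checkpoint)
probs = torch.softmax(clf(**inputs).logits, dim=-1)  # head -> logits -> probabilities
print(probs)
```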

The Transformer architecture

https://www.youtube.com/watch?v=H39Z_720T5s&list=PLo2EIpI_JMQvWfQndUesu0nPBAtZ9gP1o&index=5 Encoders, decoders, encoder-decoders. The encoder converts text into numerical representations. The combination of the two parts is known as an encoder-decoder, or sequence-to-sequence (seq2seq) transformer. https://www.youtube.com/watch?v=MUqNwgPjJvQ&list=PLo2EIpI_JMQvWfQndUesu0nPBAtZ9gP1o&index=6 BERT is a popular encoder. Welcome t..

Hugging Face Course 2023. 9. 13. 19:14
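A minimal sketch contrasting the families named above, with BERT as an encoder (fill-mask) and GPT-2 as a decoder (text generation); both model choices are illustrative assumptions:

```python
from transformers import pipeline

# Encoder: bidirectional context, suited to understanding tasks like fill-mask.
encoder = pipeline("fill-mask", model="bert-base-uncased")
print(encoder("Transformers are [MASK] models.")[0]["token_str"])

# Decoder: left-to-right context, suited to generating text.
decoder = pipeline("text-generation", model="gpt2")
print(decoder("Transformers are", max_new_tokens=10)[0]["generated_text"])
```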
