Fine-tune a pretrained model
Using a pretrained model has significant benefits: it reduces computation costs and your carbon footprint, and it lets you use state-of-the-art models without training one from scratch. 🤗 Transformers provides access to thousands of pretrained models for a wide range of tasks. When you use a pretrained model, you train it on a dataset specific to your task. This is known as fine-tuning, an incredibly powerful training technique. In this tutorial, you will fine-tune a pretrained model with a deep learning framework of your choice:
- Fine-tune a pretrained model with the 🤗 Transformers Trainer.
- Fine-tune a pretrained model in TensorFlow with Keras.
- Fine-tune a pretrained model in native PyTorch.
Prepare a dataset
Before you can fine-tune a pretrained model, download a dataset and prepare it for training. The previous tutorial showed how to process data for training, and now you get the chance to put those skills to use!
Begin by loading the Yelp Reviews dataset:
>>> from datasets import load_dataset
>>> dataset = load_dataset("yelp_review_full")
>>> dataset["train"][100]
{'label': 0,
'text': 'My expectations for McDonalds are t rarely high. But for one to still fail so spectacularly...that takes something special!\\nThe cashier took my friends\'s order, then promptly ignored me. I had to force myself in front of a cashier who opened his register to wait on the person BEHIND me. I waited over five minutes for a gigantic order that included precisely one kid\'s meal. After watching two people who ordered after me be handed their food, I asked where mine was. The manager started yelling at the cashiers for \\"serving off their orders\\" when they didn\'t have their food. But neither cashier was anywhere near those controls, and the manager was the one serving food to customers and clearing the boards.\\nThe manager was rude when giving me my order. She didn\'t make sure that I had everything ON MY RECEIPT, and never even had the decency to apologize that I felt I was getting poor service.\\nI\'ve eaten at various McDonalds restaurants for over 30 years. I\'ve worked at more than one location. I expect bad days, bad moods, and the occasional mistake. But I have yet to have a decent experience at this store. It will remain a place I avoid unless someone in my party needs to avoid illness from low blood sugar. Perhaps I should go back to the racially biased service of Steak n Shake instead!'}
As you know, you need a tokenizer to process the text, together with a padding and truncation strategy to handle variable sequence lengths. To process the dataset in one step, use the 🤗 Datasets map method to apply a preprocessing function over the entire dataset:
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("google-bert/bert-base-cased")
>>> def tokenize_function(examples):
...     return tokenizer(examples["text"], padding="max_length", truncation=True)
>>> tokenized_datasets = dataset.map(tokenize_function, batched=True)
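To make the effect of padding="max_length" and truncation=True concrete, here is a minimal pure-Python sketch. The helper pad_and_truncate and the pad_id value are hypothetical and only illustrate the idea. The real tokenizer also handles special tokens during truncation, which this toy version ignores.

```python
def pad_and_truncate(token_ids, max_length, pad_id=0):
    """Toy stand-in for padding="max_length" combined with truncation=True."""
    # Truncation: keep at most max_length tokens.
    kept = token_ids[:max_length]
    # Padding: fill the remainder with pad_id up to max_length.
    n_pad = max_length - len(kept)
    return {
        "input_ids": kept + [pad_id] * n_pad,
        # 1 marks a real token, 0 marks a padding position.
        "attention_mask": [1] * len(kept) + [0] * n_pad,
    }


short = pad_and_truncate([101, 7592, 102], max_length=5)
long = pad_and_truncate([101, 1, 2, 3, 4, 5, 102], max_length=5)
print(short)
print(long)
```

Every sequence ends up with the same length, which is what lets the samples be stacked into rectangular batches later on.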
If you like, you can create a smaller subset of the full dataset to fine-tune on and reduce the running time:
>>> small_train_dataset = tokenized_datasets["train"].shuffle(seed=42).select(range(1000))
>>> small_eval_dataset = tokenized_datasets["test"].shuffle(seed=42).select(range(1000))
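The shuffle-then-select pattern above can be pictured with a small pure-Python sketch. The helper shuffled_subset and the toy rows are hypothetical. 🤗 Datasets uses its own random generator, so the actual order will differ, but the principle of a fixed seed giving a reproducible subset is the same.

```python
import random


def shuffled_subset(rows, n, seed=42):
    """Return n rows taken from a seeded shuffle, leaving `rows` untouched."""
    indices = list(range(len(rows)))
    # A fixed seed always produces the same permutation of indices.
    random.Random(seed).shuffle(indices)
    return [rows[i] for i in indices[:n]]


rows = [{"text": f"review {i}", "label": i % 5} for i in range(10_000)]
subset_a = shuffled_subset(rows, 1000)
subset_b = shuffled_subset(rows, 1000)
print(len(subset_a), subset_a == subset_b)
```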
Train
At this point, follow the section that corresponds to the framework you want to use: the 🤗 Transformers Trainer, TensorFlow with Keras, or native PyTorch.
Train with PyTorch Trainer
🤗 Transformers provides a Trainer class optimized for training 🤗 Transformers models, making it easier to start training without manually writing your own training loop. The Trainer API supports a wide range of training options and features such as logging, gradient accumulation, and mixed precision.
Start by loading your model and specifying the number of expected labels. From the Yelp Review dataset card, you know there are five labels:
>>> from transformers import AutoModelForSequenceClassification
>>> model = AutoModelForSequenceClassification.from_pretrained("google-bert/bert-base-cased", num_labels=5)
You will see a warning that some of the pretrained weights are not used and some weights are randomly initialized. Don't worry, this is completely normal! The pretrained head of the BERT model is discarded and replaced with a randomly initialized classification head. You will fine-tune this new model head on your sequence classification task, transferring the knowledge of the pretrained model to it.
Training Hyperparameters
Next, create a TrainingArguments class, which contains all the hyperparameters you can tune as well as flags for activating different training options. For this tutorial you can start with the default training hyperparameters, but feel free to experiment with them to find your optimal settings.
Specify where to save the checkpoints from your training:
>>> from transformers import TrainingArguments
>>> training_args = TrainingArguments(output_dir="test_trainer")
Evaluate
Trainer does not automatically evaluate model performance during training. You need to pass Trainer a function that computes and reports metrics. The 🤗 Evaluate library provides a simple accuracy function that you can load with the evaluate.load function (see the 🤗 Evaluate quicktour for more information):
>>> import numpy as np
>>> import evaluate
>>> metric = evaluate.load("accuracy")
Call compute on metric to calculate the accuracy of your predictions. Before passing your predictions to compute, you need to convert the logits to predictions (remember that all 🤗 Transformers models return logits):
>>> def compute_metrics(eval_pred):
...     logits, labels = eval_pred
...     predictions = np.argmax(logits, axis=-1)
...     return metric.compute(predictions=predictions, references=labels)
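As a rough illustration of what compute_metrics does, the sketch below converts logits to class predictions with an argmax and then computes accuracy by hand. The helpers argmax and accuracy and the toy logits are made up for this example. They are not part of 🤗 Evaluate.

```python
def argmax(row):
    """Index of the largest value in a list of scores."""
    return max(range(len(row)), key=row.__getitem__)


def accuracy(logits, labels):
    # One predicted class per row of logits.
    predictions = [argmax(row) for row in logits]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return {"accuracy": correct / len(labels)}


logits = [
    [0.1, 2.3, -1.0, 0.0, 0.4],  # highest score at index 1
    [1.5, 0.2, 0.1, 0.0, -0.3],  # highest score at index 0
    [-0.2, 0.0, 0.3, 0.1, 4.0],  # highest score at index 4
    [0.9, 0.8, 0.7, 0.6, 0.5],   # highest score at index 0
]
labels = [1, 0, 4, 2]
result = accuracy(logits, labels)
print(result)  # three of the four predictions match their label
```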
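If you'd like to monitor your evaluation metrics during fine-tuning, specify the evaluation_strategy parameter in your training arguments to report the evaluation metric at the end of each epoch (recent 🤗 Transformers releases rename this parameter to eval_strategy):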
>>> from transformers import TrainingArguments, Trainer
>>> training_args = TrainingArguments(output_dir="test_trainer", evaluation_strategy="epoch")
Trainer
Create a Trainer object with your model, training arguments, training and test datasets, and evaluation function:
>>> trainer = Trainer(
...     model=model,
...     args=training_args,
...     train_dataset=small_train_dataset,
...     eval_dataset=small_eval_dataset,
...     compute_metrics=compute_metrics,
... )
Then fine-tune your model by calling train():
>>> trainer.train()
Train a TensorFlow model with Keras
You can also train 🤗 Transformers models in TensorFlow with the Keras API!
Loading data for Keras
When you want to train a 🤗 Transformers model with the Keras API, you need to convert your dataset to a format that Keras understands. If your dataset is small, you can convert the whole thing to NumPy arrays and pass it to Keras. Let's try that first before doing anything more complicated.
First, load a dataset. This example uses the CoLA dataset from the GLUE benchmark, since it is a simple binary text classification task, and takes only the training split for now.
from datasets import load_dataset
dataset = load_dataset("glue", "cola")
dataset = dataset["train"]  # Just take the training split for now
Next, load a tokenizer and tokenize the data as NumPy arrays. Note that the labels are already a list of 0s and 1s, so they can be converted directly to a NumPy array without tokenization:
import numpy as np
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("google-bert/bert-base-cased")
tokenized_data = tokenizer(dataset["sentence"], return_tensors="np", padding=True)
# The tokenizer returns a BatchEncoding, but we convert that to a dict for Keras
tokenized_data = dict(tokenized_data)
labels = np.array(dataset["label"])  # The labels are already a list of 0s and 1s
Finally, load, compile, and fit the model. Note that Transformers models all have a default task-relevant loss function, so you don't need to specify one unless you want to:
from transformers import TFAutoModelForSequenceClassification
from tensorflow.keras.optimizers import Adam
# Load and compile the model
model = TFAutoModelForSequenceClassification.from_pretrained("google-bert/bert-base-cased")
# Lower learning rates are often better for fine-tuning
model.compile(optimizer=Adam(3e-5))  # No loss argument needed
model.fit(tokenized_data, labels)
You don't have to pass a loss argument to your models when you compile() them! Hugging Face models automatically choose a loss that is appropriate for their task and model architecture if this argument is left blank. You can always override this by specifying a loss yourself if you want to.
This approach works well for smaller datasets, but it can become a problem for larger ones. The tokenized array and labels have to be fully loaded into memory, and because NumPy does not handle "jagged" arrays, every tokenized sample has to be padded to the length of the longest sample in the whole dataset. That makes the array even bigger, and all those padding tokens slow down training too.
Loading data as a tf.data.Dataset
If you want to avoid slowing down training, you can load your data as a tf.data.Dataset instead. Although you can write your own tf.data pipeline if you want, there are two convenience methods for doing this:
- prepare_tf_dataset(): This is the method recommended in most cases. Because it is a method on your model, it can inspect the model to automatically figure out which columns are usable as model inputs, and discard the others to make a simpler, more performant dataset.
- to_tf_dataset: This method is more low-level, and is useful when you want to control exactly how your dataset is created, by specifying exactly which columns and label_cols to include.
Before you can use prepare_tf_dataset(), you need to add the tokenizer outputs to your dataset as columns, as shown in the following code sample:
def tokenize_dataset(data):
    # Keys of the returned dictionary will be added to the dataset as columns
    # (the CoLA text column is named "sentence")
    return tokenizer(data["sentence"])
dataset = dataset.map(tokenize_dataset)
Remember that Hugging Face datasets are stored on disk by default, so this will not inflate your memory usage! Once the columns have been added, you can stream batches from the dataset and add padding to each batch, which greatly reduces the number of padding tokens compared to padding the entire dataset.
>>> tf_dataset = model.prepare_tf_dataset(dataset, batch_size=16, shuffle=True, tokenizer=tokenizer)
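To see why per-batch padding saves tokens, the following sketch counts padding tokens under both strategies for a handful of made-up sequence lengths. The helper names and the lengths are hypothetical. This is plain arithmetic, not what prepare_tf_dataset does internally.

```python
def padding_tokens_global(lengths):
    """Padding needed when every sample is padded to the longest in the dataset."""
    longest = max(lengths)
    return sum(longest - n for n in lengths)


def padding_tokens_per_batch(lengths, batch_size):
    """Padding needed when each batch is padded only to its own longest sample."""
    total = 0
    for start in range(0, len(lengths), batch_size):
        batch = lengths[start:start + batch_size]
        longest = max(batch)
        total += sum(longest - n for n in batch)
    return total


# One unusually long sample (40 tokens) among otherwise short ones.
lengths = [5, 7, 6, 40, 8, 9, 6, 7]
global_pad = padding_tokens_global(lengths)
batch_pad = padding_tokens_per_batch(lengths, batch_size=4)
print(global_pad, batch_pad)
```

With global padding, the single long sample inflates every row. With per-batch padding, only the batch that contains it pays the cost.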
Note that in the code sample above, you need to pass the tokenizer to prepare_tf_dataset so it can correctly pad batches as they are loaded. If all the samples in your dataset are the same length and no padding is necessary, you can skip this argument. If you need to do something more complex than padding samples (for example, corrupting tokens for masked language modeling), you can use the collate_fn argument instead to pass a function that transforms the list of samples into a batch and applies any preprocessing you want. See the examples or notebooks to see this approach in action.
Once you have created a tf.data.Dataset, you can compile and fit the model as before:
model.compile(optimizer=Adam(3e-5))  # No loss argument needed
model.fit(tf_dataset)
Train in native PyTorch
Trainer takes care of the training loop and allows you to fine-tune a model in a single line of code. For users who prefer to write their own training loop, you can also fine-tune a 🤗 Transformers model in native PyTorch.
At this point, you may need to restart your notebook or execute the following code to free some memory:
import torch
del model
del trainer
torch.cuda.empty_cache()
- The model does not accept raw text as input, so remove the text column:
>>> tokenized_datasets = tokenized_datasets.remove_columns(["text"])
- Rename the label column to labels, because the model expects the argument to be named labels:
>>> tokenized_datasets = tokenized_datasets.rename_column("label", "labels")
- Set the format of the dataset to return PyTorch tensors instead of lists:
>>> tokenized_datasets.set_format("torch")
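The column changes above can be pictured on plain dictionaries. The sketch below uses a hypothetical helper, prepare_rows, and made-up rows to show the resulting layout. It does not model the conversion to PyTorch tensors.

```python
def prepare_rows(rows):
    """Drop the raw text and rename label -> labels, without mutating the input."""
    prepared = []
    for row in rows:
        row = dict(row)                    # shallow copy of the row
        row.pop("text", None)              # the model cannot consume raw text
        row["labels"] = row.pop("label")   # the model expects the name "labels"
        prepared.append(row)
    return prepared


rows = [
    {"text": "great food", "label": 4, "input_ids": [101, 2307, 102]},
    {"text": "cold fries", "label": 0, "input_ids": [101, 3147, 102]},
]
prepared = prepare_rows(rows)
print(prepared[0])
```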
Then create a smaller subset of the dataset, as shown previously, to speed up the fine-tuning:
>>> small_train_dataset = tokenized_datasets["train"].shuffle(seed=42).select(range(1000))
>>> small_eval_dataset = tokenized_datasets["test"].shuffle(seed=42).select(range(1000))
DataLoader
Create a DataLoader for your training and test datasets so you can iterate over batches of data:
>>> from torch.utils.data import DataLoader
>>> train_dataloader = DataLoader(small_train_dataset, shuffle=True, batch_size=8)
>>> eval_dataloader = DataLoader(small_eval_dataset, batch_size=8)
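As a rough picture of what iterating over batches means, here is a minimal pure-Python sketch. The generator iter_batches is hypothetical. A real DataLoader additionally shuffles when asked and collates each batch into tensors.

```python
def iter_batches(samples, batch_size):
    """Yield consecutive slices of at most batch_size samples."""
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]


samples = list(range(1000))  # stand-in for the 1000-example subset
batches = list(iter_batches(samples, batch_size=8))
print(len(batches), len(batches[0]))
```

With 1000 samples and a batch size of 8, one pass over the data takes 125 steps, which is the value len(train_dataloader) reports for the training subset used here.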
Load your model with the number of expected labels:
>>> from transformers import AutoModelForSequenceClassification
>>> model = AutoModelForSequenceClassification.from_pretrained("google-bert/bert-base-cased", num_labels=5)
Optimizer and learning rate scheduler
Create an optimizer and learning rate scheduler to fine-tune the model. Use the AdamW optimizer from PyTorch:
>>> from torch.optim import AdamW
>>> optimizer = AdamW(model.parameters(), lr=5e-5)
Create the default learning rate scheduler from Trainer:
>>> from transformers import get_scheduler
>>> num_epochs = 3
>>> num_training_steps = num_epochs * len(train_dataloader)
>>> lr_scheduler = get_scheduler(
...     name="linear", optimizer=optimizer, num_warmup_steps=0, num_training_steps=num_training_steps
... )
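To give a feel for the "linear" schedule, the sketch below computes the learning rate by hand. The function linear_lr is a hypothetical re-derivation (linear warmup, then linear decay to zero), not a call into the library. The step count assumes the 1000-example subset and batch size of 8 used in this tutorial.

```python
import math


def linear_lr(step, base_lr, num_warmup_steps, num_training_steps):
    """Learning rate at `step` for a linear warmup followed by linear decay to 0."""
    if step < num_warmup_steps:
        scale = step / max(1, num_warmup_steps)
    else:
        remaining = num_training_steps - step
        scale = max(0.0, remaining / max(1, num_training_steps - num_warmup_steps))
    return base_lr * scale


num_epochs = 3
batches_per_epoch = math.ceil(1000 / 8)               # 125 batches per epoch
num_training_steps = num_epochs * batches_per_epoch   # 375 steps in total

lr_start = linear_lr(0, 5e-5, 0, num_training_steps)
lr_mid = linear_lr(150, 5e-5, 0, num_training_steps)
lr_end = linear_lr(num_training_steps, 5e-5, 0, num_training_steps)
print(num_training_steps, lr_start, lr_mid, lr_end)
```

With no warmup steps, the rate starts at the optimizer's value of 5e-5 and falls in a straight line to zero at the final step.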
Lastly, specify device to use a GPU if you have access to one. Otherwise, training on a CPU may take several hours instead of a couple of minutes.
>>> import torch
>>> device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")
>>> model.to(device)
If you don't have a GPU of your own, you can get free access to a cloud GPU through a hosted notebook such as Colaboratory or SageMaker StudioLab.
Great, now you are ready to train! 🥳
Training loop
To keep track of your training progress, use the tqdm library to add a progress bar over the number of training steps:
>>> from tqdm.auto import tqdm
>>> progress_bar = tqdm(range(num_training_steps))
>>> model.train()
>>> for epoch in range(num_epochs):
...     for batch in train_dataloader:
...         batch = {k: v.to(device) for k, v in batch.items()}
...         outputs = model(**batch)
...         loss = outputs.loss
...         loss.backward()
...         optimizer.step()
...         lr_scheduler.step()
...         optimizer.zero_grad()
...         progress_bar.update(1)
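The order of operations in the loop (forward pass, backward pass, optimizer step, scheduler step, gradient reset) can be mirrored in a toy setting without PyTorch. The sketch below fits a single weight w to data generated by y = 3x. All names and numbers are made up for illustration, and plain gradient descent stands in for AdamW.

```python
# Toy data generated by y = 3 * x; the "model" is a single weight w.
data = [(x, 3.0 * x) for x in (1.0, 2.0, 3.0, 4.0)]

w = 0.0
base_lr = 0.02
num_epochs = 50
num_training_steps = num_epochs * len(data)

step = 0
grad = 0.0  # gradients accumulate here until they are reset
for epoch in range(num_epochs):
    for x, y in data:
        error = w * x - y                  # forward pass
        loss = error ** 2
        grad += 2 * error * x              # "backward": accumulate d(loss)/dw
        # Linearly decayed learning rate for the current step.
        lr = base_lr * (num_training_steps - step) / num_training_steps
        w -= lr * grad                     # optimizer step
        step += 1                          # scheduler step
        grad = 0.0                         # zero_grad: forget the old gradient

print(round(w, 6))
```

Resetting the gradient after each step matters because gradients are accumulated rather than overwritten. Without the reset, every update would also include the gradients of all earlier batches.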
Evaluate
Just as you added an evaluation function to Trainer, you need to do the same when you write your own training loop. But instead of calculating and reporting the metric at the end of each epoch, this time you accumulate all the batches with add_batch and calculate the metric at the very end.
>>> import evaluate
>>> metric = evaluate.load("accuracy")
>>> model.eval()
>>> for batch in eval_dataloader:
...     batch = {k: v.to(device) for k, v in batch.items()}
...     with torch.no_grad():
...         outputs = model(**batch)
...     logits = outputs.logits
...     predictions = torch.argmax(logits, dim=-1)
...     metric.add_batch(predictions=predictions, references=batch["labels"])
>>> metric.compute()
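The accumulate-then-compute pattern can be sketched with a small class. AccuracyAccumulator is a hypothetical stand-in written for this example. It mimics the add_batch and compute method names but is not the 🤗 Evaluate implementation.

```python
class AccuracyAccumulator:
    """Toy metric that collects per-batch counts and reports accuracy at the end."""

    def __init__(self):
        self.correct = 0
        self.total = 0

    def add_batch(self, predictions, references):
        # Only running counts are stored, not the predictions themselves.
        self.correct += sum(p == r for p, r in zip(predictions, references))
        self.total += len(references)

    def compute(self):
        return {"accuracy": self.correct / self.total}


toy_metric = AccuracyAccumulator()
toy_metric.add_batch(predictions=[0, 1, 4], references=[0, 1, 3])  # 2 of 3 correct
toy_metric.add_batch(predictions=[2, 2], references=[2, 0])        # 1 of 2 correct
result = toy_metric.compute()
print(result)
```

Accumulating over batches gives the same result as scoring the whole evaluation set at once, even when the last batch is smaller than the others.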
Additional resources
For more fine-tuning examples, refer to:
- 🤗 Transformers Examples includes scripts to train common NLP tasks in PyTorch and TensorFlow.
- 🤗 Transformers Notebooks contains various notebooks on how to fine-tune a model for specific tasks.