jimypbr committed on
Commit 689344c
1 Parent(s): a11cf16

Update README.md

Files changed (1):
  1. README.md +59 -13
README.md CHANGED

@@ -9,27 +9,48 @@ model-index:
 results: []
 ---
 
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
-
 # roberta-base-squad2
 
 This model is a fine-tuned version of [roberta-base](https://huggingface.co/roberta-base) on the squad_v2 dataset.
 
-## Model description
-
-More information needed
-
-## Intended uses & limitations
-
-More information needed
-
 ## Training and evaluation data
 
-More information needed
 
 ## Training procedure
 
 ### Training hyperparameters
 
 The following hyperparameters were used during training:

@@ -48,7 +69,32 @@ The following hyperparameters were used during training:
 
 ### Training results
 
-
 
 ### Framework versions
 results: []
 ---
 
 # roberta-base-squad2
 
 This model is a fine-tuned version of [roberta-base](https://huggingface.co/roberta-base) on the squad_v2 dataset.
 
 ## Training and evaluation data
 
+ Trained and evaluated on the [squad_v2 dataset](https://huggingface.co/datasets/squad_v2).
 
 ## Training procedure
 
+ Trained on 16 Graphcore Mk2 IPUs using [optimum-graphcore](https://github.com/huggingface/optimum-graphcore).
+
+ Command line:
+
+ ```
+ python examples/question-answering/run_qa.py \
+   --ipu_config_name Graphcore/roberta-base-ipu \
+   --model_name_or_path roberta-base \
+   --dataset_name squad_v2 \
+   --version_2_with_negative \
+   --do_train \
+   --do_eval \
+   --num_train_epochs 3 \
+   --per_device_train_batch_size 4 \
+   --per_device_eval_batch_size 2 \
+   --pod_type pod16 \
+   --learning_rate 7e-5 \
+   --max_seq_length 384 \
+   --doc_stride 128 \
+   --seed 1984 \
+   --lr_scheduler_type linear \
+   --loss_scaling 64 \
+   --weight_decay 0.01 \
+   --warmup_ratio 0.2 \
+   --logging_steps 1 \
+   --save_steps -1 \
+   --dataloader_num_workers 64 \
+   --output_dir roberta-base-squad2 \
+   --overwrite_output_dir \
+   --push_to_hub
+ ```
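For intuition, the schedule selected by `--lr_scheduler_type linear` with `--warmup_ratio 0.2` and `--learning_rate 7e-5` can be sketched in plain Python (a minimal illustration of linear warmup/decay, not optimum-graphcore's implementation; `total_steps` stands in for the actual number of optimizer steps):

```python
def linear_lr(step, total_steps, base_lr=7e-5, warmup_ratio=0.2):
    """Linear warmup over the first warmup_ratio of steps, then linear decay to zero."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Ramp up from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr down to 0 over the remaining steps.
    return base_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
```

With a 20% warmup, the peak learning rate of 7e-5 is reached a fifth of the way through training and decays to zero by the final step.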
+
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
 
 ### Training results
 
+ ```
+ ***** train metrics *****
+ epoch = 3.0
+ train_loss = 0.9982
+ train_runtime = 0:04:44.21
+ train_samples = 131823
+ train_samples_per_second = 1391.43
+ train_steps_per_second = 5.425
+
+ ***** eval metrics *****
+ epoch = 3.0
+ eval_HasAns_exact = 78.1208
+ eval_HasAns_f1 = 84.6569
+ eval_HasAns_total = 5928
+ eval_NoAns_exact = 82.0353
+ eval_NoAns_f1 = 82.0353
+ eval_NoAns_total = 5945
+ eval_best_exact = 80.0809
+ eval_best_exact_thresh = 0.0
+ eval_best_f1 = 83.3442
+ eval_best_f1_thresh = 0.0
+ eval_exact = 80.0809
+ eval_f1 = 83.3442
+ eval_samples = 12165
+ eval_total = 11873
+ ```
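As a quick consistency check (an illustration, not part of the training logs), the logged throughput follows from the sample count, epoch count, and runtime:

```python
# Reconstruct samples/second from the metrics above.
train_samples = 131_823           # per the train metrics log
num_epochs = 3
train_runtime_s = 4 * 60 + 44.21  # "0:04:44.21" expressed in seconds

samples_per_second = train_samples * num_epochs / train_runtime_s
# Agrees with the logged 1391.43 up to rounding of the runtime.
print(f"{samples_per_second:.2f}")
```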
 
 ### Framework versions
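`eval_exact` and `eval_f1` above are SQuAD-style metrics. As a reference for how they are defined per example, here is a simplified plain-Python sketch (it mirrors the standard SQuAD answer normalization and token-overlap F1, but is not the official evaluation script):

```python
import re
import string
from collections import Counter

def normalize(text):
    """SQuAD-style normalization: lowercase, drop punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction, reference):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))

def f1_score(prediction, reference):
    """Token-overlap F1 between normalized prediction and reference."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    if not pred_tokens or not ref_tokens:
        # Unanswerable questions: both empty counts as a match.
        return float(pred_tokens == ref_tokens)
    common = Counter(pred_tokens) & Counter(ref_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

The corpus-level `exact` and `f1` numbers are averages of these per-example scores; SQuAD v2 additionally takes the max over reference answers and applies the no-answer threshold reflected in `eval_best_*_thresh`.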