jimypbr committed on
Commit 689344c
1 Parent(s): a11cf16

Update README.md

Files changed (1):
  1. README.md +59 -13
README.md CHANGED

@@ -9,27 +9,48 @@ model-index:
 results: []
 ---
 
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
-
 # roberta-base-squad2
 
 This model is a fine-tuned version of [roberta-base](https://huggingface.co/roberta-base) on the squad_v2 dataset.
 
-## Model description
-
-More information needed
-
-## Intended uses & limitations
-
-More information needed
-
 ## Training and evaluation data
 
-More information needed
 
 ## Training procedure
 
 ### Training hyperparameters
 
 The following hyperparameters were used during training:

@@ -48,7 +69,32 @@ The following hyperparameters were used during training:
 
 ### Training results
 
-
 
 ### Framework versions
 results: []
 ---
 
 # roberta-base-squad2
 
 This model is a fine-tuned version of [roberta-base](https://huggingface.co/roberta-base) on the squad_v2 dataset.
 
 ## Training and evaluation data
 
+ Trained and evaluated on the [squad_v2 dataset](https://huggingface.co/datasets/squad_v2).
 
 ## Training procedure
 
+ Trained on 16 Graphcore Mk2 IPUs using [optimum-graphcore](https://github.com/huggingface/optimum-graphcore).
+
+ Command line:
+
+ ```
+ python examples/question-answering/run_qa.py \
+   --ipu_config_name Graphcore/roberta-base-ipu \
+   --model_name_or_path roberta-base \
+   --dataset_name squad_v2 \
+   --version_2_with_negative \
+   --do_train \
+   --do_eval \
+   --num_train_epochs 3 \
+   --per_device_train_batch_size 4 \
+   --per_device_eval_batch_size 2 \
+   --pod_type pod16 \
+   --learning_rate 7e-5 \
+   --max_seq_length 384 \
+   --doc_stride 128 \
+   --seed 1984 \
+   --lr_scheduler_type linear \
+   --loss_scaling 64 \
+   --weight_decay 0.01 \
+   --warmup_ratio 0.2 \
+   --logging_steps 1 \
+   --save_steps -1 \
+   --dataloader_num_workers 64 \
+   --output_dir roberta-base-squad2 \
+   --overwrite_output_dir \
+   --push_to_hub
+ ```
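For intuition, the schedule selected by `--lr_scheduler_type linear` with `--warmup_ratio 0.2` and `--learning_rate 7e-5` can be sketched in plain Python (a minimal illustration of linear warmup/decay, not optimum-graphcore's implementation; `total_steps` stands in for the actual number of optimizer steps):

```python
def linear_lr(step, total_steps, base_lr=7e-5, warmup_ratio=0.2):
    """Linear warmup over the first warmup_ratio of steps, then linear decay to zero."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Ramp up from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr down to 0 over the remaining steps.
    return base_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
```

With a 20% warmup, the peak learning rate of 7e-5 is reached a fifth of the way through training and decays to zero by the final step.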
+
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
 
 ### Training results
 
+ ```
+ ***** train metrics *****
+ epoch = 3.0
+ train_loss = 0.9982
+ train_runtime = 0:04:44.21
+ train_samples = 131823
+ train_samples_per_second = 1391.43
+ train_steps_per_second = 5.425
+
+ ***** eval metrics *****
+ epoch = 3.0
+ eval_HasAns_exact = 78.1208
+ eval_HasAns_f1 = 84.6569
+ eval_HasAns_total = 5928
+ eval_NoAns_exact = 82.0353
+ eval_NoAns_f1 = 82.0353
+ eval_NoAns_total = 5945
+ eval_best_exact = 80.0809
+ eval_best_exact_thresh = 0.0
+ eval_best_f1 = 83.3442
+ eval_best_f1_thresh = 0.0
+ eval_exact = 80.0809
+ eval_f1 = 83.3442
+ eval_samples = 12165
+ eval_total = 11873
+ ```
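As a quick consistency check (an illustration, not part of the training logs), the logged throughput follows from the sample count, epoch count, and runtime:

```python
# Reconstruct samples/second from the metrics above.
train_samples = 131_823           # per the train metrics log
num_epochs = 3
train_runtime_s = 4 * 60 + 44.21  # "0:04:44.21" expressed in seconds

samples_per_second = train_samples * num_epochs / train_runtime_s
# Agrees with the logged 1391.43 up to rounding of the runtime.
print(f"{samples_per_second:.2f}")
```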
 
 ### Framework versions
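`eval_exact` and `eval_f1` above are SQuAD-style metrics. As a reference for how they are defined per example, here is a simplified plain-Python sketch (it mirrors the standard SQuAD answer normalization and token-overlap F1, but is not the official evaluation script):

```python
import re
import string
from collections import Counter

def normalize(text):
    """SQuAD-style normalization: lowercase, drop punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction, reference):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))

def f1_score(prediction, reference):
    """Token-overlap F1 between normalized prediction and reference."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    if not pred_tokens or not ref_tokens:
        # Unanswerable questions: both empty counts as a match.
        return float(pred_tokens == ref_tokens)
    common = Counter(pred_tokens) & Counter(ref_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

The corpus-level `exact` and `f1` numbers are averages of these per-example scores; SQuAD v2 additionally takes the max over reference answers and applies the no-answer threshold reflected in `eval_best_*_thresh`.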