sgugger committed
Commit 947a164
1 Parent(s): a626e2c

End of training

README.md ADDED
@@ -0,0 +1,86 @@
+ ---
+ language:
+ - en
+ license: apache-2.0
+ tags:
+ - generated_from_trainer
+ datasets:
+ - glue
+ metrics:
+ - accuracy
+ - f1
+ model-index:
+ - name: bert-finetuned-mrpc
+   results:
+   - task:
+       name: Text Classification
+       type: text-classification
+     dataset:
+       name: GLUE MRPC
+       type: glue
+       args: mrpc
+     metrics:
+     - name: Accuracy
+       type: accuracy
+       value: 0.8602941176470589
+     - name: F1
+       type: f1
+       value: 0.9032258064516129
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # bert-finetuned-mrpc
+
+ This model is a fine-tuned version of [bert-base-cased](https://huggingface.co/bert-base-cased) on the GLUE MRPC dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 0.5152
+ - Accuracy: 0.8603
+ - F1: 0.9032
+ - Combined Score: 0.8818
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 5e-05
+ - train_batch_size: 8
+ - eval_batch_size: 8
+ - seed: 42
+ - distributed_type: multi-GPU
+ - num_devices: 2
+ - total_train_batch_size: 16
+ - total_eval_batch_size: 16
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - num_epochs: 3.0
+
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss | Accuracy | F1     | Combined Score |
+ |:-------------:|:-----:|:----:|:---------------:|:--------:|:------:|:--------------:|
+ | No log        | 1.0   | 230  | 0.3668          | 0.8431   | 0.8881 | 0.8656         |
+ | No log        | 2.0   | 460  | 0.3751          | 0.8578   | 0.9017 | 0.8798         |
+ | 0.4264        | 3.0   | 690  | 0.5152          | 0.8603   | 0.9032 | 0.8818         |
+
+
+ ### Framework versions
+
+ - Transformers 4.11.0.dev0
+ - Pytorch 1.8.1+cu111
+ - Datasets 1.10.3.dev0
+ - Tokenizers 0.10.3
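
A minimal usage sketch for the checkpoint described in the model card above. The repo id `sgugger/bert-finetuned-mrpc` is inferred from the commit author and the model name, so treat it as an assumption:

```python
# Sketch: paraphrase classification with the fine-tuned MRPC checkpoint.
# The repo id below is assumed from the commit author and model name.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "sgugger/bert-finetuned-mrpc"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# MRPC is a sentence-pair task: is the second sentence a paraphrase of the first?
inputs = tokenizer(
    "The company said the deal was finalized on Monday.",
    "The deal was completed Monday, the company said.",
    return_tensors="pt",
)
with torch.no_grad():
    logits = model(**inputs).logits
prediction = logits.argmax(dim=-1).item()  # 1 = equivalent, 0 = not equivalent in GLUE MRPC
print(prediction)
```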
all_results.json ADDED
@@ -0,0 +1,16 @@
+ {
+   "epoch": 3.0,
+   "eval_accuracy": 0.8602941176470589,
+   "eval_combined_score": 0.8817599620493359,
+   "eval_f1": 0.9032258064516129,
+   "eval_loss": 0.5152415037155151,
+   "eval_runtime": 0.681,
+   "eval_samples": 408,
+   "eval_samples_per_second": 599.092,
+   "eval_steps_per_second": 38.177,
+   "train_loss": 0.36320947287739186,
+   "train_runtime": 105.4434,
+   "train_samples": 3668,
+   "train_samples_per_second": 104.359,
+   "train_steps_per_second": 6.544
+ }
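
For reference, the `eval_combined_score` reported in `all_results.json` is simply the arithmetic mean of the accuracy and F1 values logged above:

```python
# Sanity check: combined score = mean of accuracy and F1 (matches all_results.json).
eval_accuracy = 0.8602941176470589
eval_f1 = 0.9032258064516129
print((eval_accuracy + eval_f1) / 2)  # 0.8817599620493359
```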
emissions.csv ADDED
@@ -0,0 +1,2 @@
+ timestamp,experiment_id,project_name,duration,emissions,energy_consumed,country_name,country_iso_code,region,on_cloud,cloud_provider,cloud_region
+ 2021-09-14T13:10:06,cc9a5ccd-7eba-40e5-92ae-3a66f62862bb,codecarbon,108.45673489570618,0.002376985104054234,0.011293382035678484,United States,USA,new york,N,,
eval_results.json ADDED
@@ -0,0 +1,11 @@
+ {
+   "epoch": 3.0,
+   "eval_accuracy": 0.8602941176470589,
+   "eval_combined_score": 0.8817599620493359,
+   "eval_f1": 0.9032258064516129,
+   "eval_loss": 0.5152415037155151,
+   "eval_runtime": 0.681,
+   "eval_samples": 408,
+   "eval_samples_per_second": 599.092,
+   "eval_steps_per_second": 38.177
+ }
runs/Sep14_13-08-06_brahms/events.out.tfevents.1631639298.brahms.1547751.0 CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:302b588ed56ac04dbedbf63a8af766ea275d1461710cf5f59e8706b9cebe72ad
- size 4697
+ oid sha256:eacb80277a67ae78f10eac26ff7d05a2dad3bf2575f44883bcc77cec49585253
+ size 5051
runs/Sep14_13-08-06_brahms/events.out.tfevents.1631639408.brahms.1547751.2 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a6c154852b8582d4a32049fe54e9756235538400310694e5f34fcbc2871f56c5
+ size 467
train_results.json ADDED
@@ -0,0 +1,8 @@
+ {
+   "epoch": 3.0,
+   "train_loss": 0.36320947287739186,
+   "train_runtime": 105.4434,
+   "train_samples": 3668,
+   "train_samples_per_second": 104.359,
+   "train_steps_per_second": 6.544
+ }
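
The throughput figures in `train_results.json` follow directly from the run length: 3 epochs over 3,668 training samples and 690 optimizer steps in roughly 105.4 seconds:

```python
# Throughput check: samples/s and steps/s derived from the logged run length.
train_samples = 3668
num_epochs = 3.0
train_runtime = 105.4434  # seconds
total_steps = 690         # final global_step from trainer_state.json

print(train_samples * num_epochs / train_runtime)  # ~104.359 samples/s
print(total_steps / train_runtime)                 # ~6.544 steps/s
```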
trainer_state.json ADDED
@@ -0,0 +1,64 @@
+ {
+   "best_metric": null,
+   "best_model_checkpoint": null,
+   "epoch": 3.0,
+   "global_step": 690,
+   "is_hyper_param_search": false,
+   "is_local_process_zero": true,
+   "is_world_process_zero": true,
+   "log_history": [
+     {
+       "epoch": 1.0,
+       "eval_accuracy": 0.8431372549019608,
+       "eval_combined_score": 0.8656245715069245,
+       "eval_f1": 0.8881118881118882,
+       "eval_loss": 0.3667730391025543,
+       "eval_runtime": 0.621,
+       "eval_samples_per_second": 657.045,
+       "eval_steps_per_second": 41.87,
+       "step": 230
+     },
+     {
+       "epoch": 2.0,
+       "eval_accuracy": 0.8578431372549019,
+       "eval_combined_score": 0.8797690262545697,
+       "eval_f1": 0.9016949152542373,
+       "eval_loss": 0.375055193901062,
+       "eval_runtime": 0.6266,
+       "eval_samples_per_second": 651.108,
+       "eval_steps_per_second": 41.492,
+       "step": 460
+     },
+     {
+       "epoch": 2.17,
+       "learning_rate": 1.3768115942028985e-05,
+       "loss": 0.4264,
+       "step": 500
+     },
+     {
+       "epoch": 3.0,
+       "eval_accuracy": 0.8602941176470589,
+       "eval_combined_score": 0.8817599620493359,
+       "eval_f1": 0.9032258064516129,
+       "eval_loss": 0.5152415037155151,
+       "eval_runtime": 0.5833,
+       "eval_samples_per_second": 699.514,
+       "eval_steps_per_second": 44.577,
+       "step": 690
+     },
+     {
+       "epoch": 3.0,
+       "step": 690,
+       "total_flos": 723818515529728.0,
+       "train_loss": 0.36320947287739186,
+       "train_runtime": 105.4434,
+       "train_samples_per_second": 104.359,
+       "train_steps_per_second": 6.544
+     }
+   ],
+   "max_steps": 690,
+   "num_train_epochs": 3,
+   "total_flos": 723818515529728.0,
+   "trial_name": null,
+   "trial_params": null
+ }