henryscheible committed
Commit ddddf13
1 Parent(s): 34a5360

update model card README.md
---
license: mit
tags:
- generated_from_trainer
datasets:
- crows_pairs
metrics:
- accuracy
model-index:
- name: gpt2_crows_pairs_finetuned
  results:
  - task:
      name: Text Classification
      type: text-classification
    dataset:
      name: crows_pairs
      type: crows_pairs
      config: crows_pairs
      split: test
      args: crows_pairs
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.6357615894039735
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# gpt2_crows_pairs_finetuned

This model is a fine-tuned version of [gpt2](https://huggingface.co/gpt2) on the crows_pairs dataset.
It achieves the following results on the evaluation set:
- Loss: 0.6426
- Accuracy: 0.6358

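For quick experimentation, the checkpoint can be loaded through the `transformers` text-classification pipeline. A minimal sketch: the Hub id `henryscheible/gpt2_crows_pairs_finetuned` and the meaning of the two class labels are assumptions, as the card does not state how the model is published or what the classes represent.

```python
def load_classifier(model_id: str = "henryscheible/gpt2_crows_pairs_finetuned"):
    """Build a text-classification pipeline for the fine-tuned GPT-2.

    transformers is imported lazily so the function can be defined without
    the library installed; calling it downloads the checkpoint from the Hub.
    """
    from transformers import pipeline  # requires `pip install transformers`

    return pipeline("text-classification", model=model_id)


# Example (downloads the checkpoint on first call):
# clf = load_classifier()
# clf("Some sentence to classify.")  # returns [{"label": ..., "score": ...}]
```

The pipeline infers the sequence-classification head and tokenizer from the checkpoint's config, so no extra setup is needed beyond the model id.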
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 128
- eval_batch_size: 64
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 30

+ ### Training results
63
+
64
+ | Training Loss | Epoch | Step | Validation Loss | Accuracy |
65
+ |:-------------:|:-----:|:----:|:---------------:|:--------:|
66
+ | No log | 0.5 | 5 | 1.8086 | 0.4934 |
67
+ | No log | 1.0 | 10 | 1.4229 | 0.4735 |
68
+ | No log | 1.5 | 15 | 1.3464 | 0.4768 |
69
+ | No log | 2.0 | 20 | 0.9351 | 0.5166 |
70
+ | No log | 2.5 | 25 | 0.8281 | 0.5099 |
71
+ | No log | 3.0 | 30 | 0.7829 | 0.4934 |
72
+ | No log | 3.5 | 35 | 0.7658 | 0.4901 |
73
+ | No log | 4.0 | 40 | 0.7416 | 0.4934 |
74
+ | No log | 4.5 | 45 | 0.7303 | 0.4801 |
75
+ | No log | 5.0 | 50 | 0.7305 | 0.5199 |
76
+ | No log | 5.5 | 55 | 0.7251 | 0.5132 |
77
+ | No log | 6.0 | 60 | 0.7145 | 0.5596 |
78
+ | No log | 6.5 | 65 | 0.7083 | 0.5497 |
79
+ | No log | 7.0 | 70 | 0.7021 | 0.5464 |
80
+ | No log | 7.5 | 75 | 0.6999 | 0.5232 |
81
+ | No log | 8.0 | 80 | 0.6929 | 0.5563 |
82
+ | No log | 8.5 | 85 | 0.6945 | 0.5662 |
83
+ | No log | 9.0 | 90 | 0.6863 | 0.5629 |
84
+ | No log | 9.5 | 95 | 0.6834 | 0.5662 |
85
+ | No log | 10.0 | 100 | 0.6819 | 0.5960 |
86
+ | No log | 10.5 | 105 | 0.6914 | 0.5563 |
87
+ | No log | 11.0 | 110 | 0.6822 | 0.5894 |
88
+ | No log | 11.5 | 115 | 0.6797 | 0.5662 |
89
+ | No log | 12.0 | 120 | 0.6782 | 0.5927 |
90
+ | No log | 12.5 | 125 | 0.6787 | 0.5861 |
91
+ | No log | 13.0 | 130 | 0.6783 | 0.5861 |
92
+ | No log | 13.5 | 135 | 0.6765 | 0.5894 |
93
+ | No log | 14.0 | 140 | 0.6696 | 0.6026 |
94
+ | No log | 14.5 | 145 | 0.6674 | 0.6126 |
95
+ | No log | 15.0 | 150 | 0.6669 | 0.5927 |
96
+ | No log | 15.5 | 155 | 0.6656 | 0.5993 |
97
+ | No log | 16.0 | 160 | 0.6684 | 0.5894 |
98
+ | No log | 16.5 | 165 | 0.6643 | 0.5960 |
99
+ | No log | 17.0 | 170 | 0.6608 | 0.6126 |
100
+ | No log | 17.5 | 175 | 0.6589 | 0.6126 |
101
+ | No log | 18.0 | 180 | 0.6614 | 0.5960 |
102
+ | No log | 18.5 | 185 | 0.6619 | 0.5861 |
103
+ | No log | 19.0 | 190 | 0.6603 | 0.5861 |
104
+ | No log | 19.5 | 195 | 0.6617 | 0.5894 |
105
+ | No log | 20.0 | 200 | 0.6561 | 0.6159 |
106
+ | No log | 20.5 | 205 | 0.6516 | 0.6225 |
107
+ | No log | 21.0 | 210 | 0.6493 | 0.6258 |
108
+ | No log | 21.5 | 215 | 0.6494 | 0.6325 |
109
+ | No log | 22.0 | 220 | 0.6512 | 0.6192 |
110
+ | No log | 22.5 | 225 | 0.6551 | 0.6192 |
111
+ | No log | 23.0 | 230 | 0.6518 | 0.6258 |
112
+ | No log | 23.5 | 235 | 0.6462 | 0.6325 |
113
+ | No log | 24.0 | 240 | 0.6461 | 0.6258 |
114
+ | No log | 24.5 | 245 | 0.6471 | 0.6358 |
115
+ | No log | 25.0 | 250 | 0.6478 | 0.6358 |
116
+ | No log | 25.5 | 255 | 0.6462 | 0.6358 |
117
+ | No log | 26.0 | 260 | 0.6440 | 0.6424 |
118
+ | No log | 26.5 | 265 | 0.6439 | 0.6358 |
119
+ | No log | 27.0 | 270 | 0.6436 | 0.6325 |
120
+ | No log | 27.5 | 275 | 0.6438 | 0.6325 |
121
+ | No log | 28.0 | 280 | 0.6428 | 0.6358 |
122
+ | No log | 28.5 | 285 | 0.6423 | 0.6391 |
123
+ | No log | 29.0 | 290 | 0.6426 | 0.6358 |
124
+ | No log | 29.5 | 295 | 0.6427 | 0.6358 |
125
+ | No log | 30.0 | 300 | 0.6426 | 0.6358 |
126
+
127
+
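The step counts in the table are internally consistent with the hyperparameters: 300 optimizer steps over 30 epochs gives 10 steps per epoch (matching the half-epoch logging interval of 5 steps), which at a batch size of 128 bounds the training split at roughly 1,153 to 1,280 examples. This is a back-of-the-envelope check, not a figure stated in the card:

```python
total_steps, num_epochs, train_batch_size = 300, 30, 128

# Steps per epoch implied by the table.
steps_per_epoch = total_steps // num_epochs  # 10

# With drop-free batching, ceil(N / batch_size) == steps_per_epoch,
# so the training split size N falls in this range:
max_train_examples = steps_per_epoch * train_batch_size            # 1280
min_train_examples = (steps_per_epoch - 1) * train_batch_size + 1  # 1153
```

This lines up with crows_pairs being a small dataset (about 1.5k pairs) split between training and evaluation.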
### Framework versions

- Transformers 4.26.1
- Pytorch 1.13.1
- Datasets 2.10.1
- Tokenizers 0.13.2