Commit 0bec60c by mikaelsouza (parent: a8c73bc): update model card README.md. Files changed: README.md (+172 lines).
---
tags:
- generated_from_trainer
datasets:
- wikitext
model-index:
- name: msft-regular-model
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# msft-regular-model

This model is a fine-tuned version of [](https://huggingface.co/) on the wikitext dataset.
It achieves the following results on the evaluation set:
- Loss: 5.3420

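The reported loss can be restated as a perplexity. A minimal sketch, assuming the evaluation loss is the Trainer's usual mean token-level cross-entropy in nats:

```python
import math

# Final evaluation loss reported above; assumed to be mean
# cross-entropy per token in nats.
eval_loss = 5.3420
perplexity = math.exp(eval_loss)  # perplexity = exp(cross-entropy)
print(f"perplexity: {perplexity:.1f}")  # about 209
```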
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 20

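These hyperparameters line up with the step counts in the results table, which logs validation every 200 steps. A back-of-envelope consistency check (variable names here are illustrative, not from the original training script):

```python
# The last logged row of the results table is step 23200 at epoch 19.88.
train_batch_size = 16
last_step, last_epoch = 23200, 19.88

steps_per_epoch = round(last_step / last_epoch)          # ~1167 optimizer steps
examples_per_epoch = steps_per_epoch * train_batch_size  # ~18672 training examples
print(steps_per_epoch, examples_per_epoch)
```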
### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| 9.1224 | 0.17 | 200 | 8.0736 |
| 7.5229 | 0.34 | 400 | 7.1536 |
| 7.0122 | 0.51 | 600 | 6.9072 |
| 6.8296 | 0.69 | 800 | 6.7582 |
| 6.709 | 0.86 | 1000 | 6.6436 |
| 6.5882 | 1.03 | 1200 | 6.5563 |
| 6.4807 | 1.2 | 1400 | 6.4784 |
| 6.4172 | 1.37 | 1600 | 6.4165 |
| 6.3403 | 1.54 | 1800 | 6.3555 |
| 6.2969 | 1.71 | 2000 | 6.3107 |
| 6.2346 | 1.89 | 2200 | 6.2691 |
| 6.1767 | 2.06 | 2400 | 6.2299 |
| 6.1326 | 2.23 | 2600 | 6.1937 |
| 6.1035 | 2.4 | 2800 | 6.1602 |
| 6.0624 | 2.57 | 3000 | 6.1241 |
| 6.0393 | 2.74 | 3200 | 6.0971 |
| 5.9982 | 2.91 | 3400 | 6.0656 |
| 5.9526 | 3.08 | 3600 | 6.0397 |
| 5.9086 | 3.26 | 3800 | 6.0104 |
| 5.8922 | 3.43 | 4000 | 5.9888 |
| 5.8631 | 3.6 | 4200 | 5.9661 |
| 5.8396 | 3.77 | 4400 | 5.9407 |
| 5.8055 | 3.94 | 4600 | 5.9177 |
| 5.7763 | 4.11 | 4800 | 5.9007 |
| 5.7314 | 4.28 | 5000 | 5.8834 |
| 5.7302 | 4.46 | 5200 | 5.8620 |
| 5.6987 | 4.63 | 5400 | 5.8451 |
| 5.6754 | 4.8 | 5600 | 5.8242 |
| 5.6571 | 4.97 | 5800 | 5.8059 |
| 5.615 | 5.14 | 6000 | 5.7871 |
| 5.596 | 5.31 | 6200 | 5.7817 |
| 5.5738 | 5.48 | 6400 | 5.7570 |
| 5.5641 | 5.66 | 6600 | 5.7431 |
| 5.5503 | 5.83 | 6800 | 5.7271 |
| 5.5214 | 6.0 | 7000 | 5.7108 |
| 5.4712 | 6.17 | 7200 | 5.7018 |
| 5.48 | 6.34 | 7400 | 5.6936 |
| 5.4527 | 6.51 | 7600 | 5.6812 |
| 5.4514 | 6.68 | 7800 | 5.6669 |
| 5.4454 | 6.86 | 8000 | 5.6509 |
| 5.399 | 7.03 | 8200 | 5.6408 |
| 5.3747 | 7.2 | 8400 | 5.6327 |
| 5.3667 | 7.37 | 8600 | 5.6197 |
| 5.3652 | 7.54 | 8800 | 5.6084 |
| 5.3394 | 7.71 | 9000 | 5.5968 |
| 5.3349 | 7.88 | 9200 | 5.5870 |
| 5.2994 | 8.05 | 9400 | 5.5826 |
| 5.2793 | 8.23 | 9600 | 5.5710 |
| 5.2716 | 8.4 | 9800 | 5.5623 |
| 5.275 | 8.57 | 10000 | 5.5492 |
| 5.264 | 8.74 | 10200 | 5.5449 |
| 5.241 | 8.91 | 10400 | 5.5322 |
| 5.2285 | 9.08 | 10600 | 5.5267 |
| 5.2021 | 9.25 | 10800 | 5.5187 |
| 5.1934 | 9.43 | 11000 | 5.5158 |
| 5.1737 | 9.6 | 11200 | 5.5044 |
| 5.1774 | 9.77 | 11400 | 5.5008 |
| 5.1841 | 9.94 | 11600 | 5.4960 |
| 5.1414 | 10.11 | 11800 | 5.4895 |
| 5.1491 | 10.28 | 12000 | 5.4849 |
| 5.1184 | 10.45 | 12200 | 5.4738 |
| 5.1136 | 10.63 | 12400 | 5.4690 |
| 5.1199 | 10.8 | 12600 | 5.4598 |
| 5.1056 | 10.97 | 12800 | 5.4536 |
| 5.0648 | 11.14 | 13000 | 5.4496 |
| 5.0598 | 11.31 | 13200 | 5.4449 |
| 5.0656 | 11.48 | 13400 | 5.4422 |
| 5.0664 | 11.65 | 13600 | 5.4367 |
| 5.0675 | 11.83 | 13800 | 5.4286 |
| 5.0459 | 12.0 | 14000 | 5.4249 |
| 5.0073 | 12.17 | 14200 | 5.4260 |
| 5.0229 | 12.34 | 14400 | 5.4175 |
| 5.0079 | 12.51 | 14600 | 5.4119 |
| 5.0 | 12.68 | 14800 | 5.4194 |
| 5.0094 | 12.85 | 15000 | 5.4068 |
| 4.9967 | 13.02 | 15200 | 5.3995 |
| 4.9541 | 13.2 | 15400 | 5.4002 |
| 4.9753 | 13.37 | 15600 | 5.3965 |
| 4.9732 | 13.54 | 15800 | 5.3925 |
| 4.9624 | 13.71 | 16000 | 5.3888 |
| 4.9559 | 13.88 | 16200 | 5.3824 |
| 4.9559 | 14.05 | 16400 | 5.3851 |
| 4.9109 | 14.22 | 16600 | 5.3815 |
| 4.9211 | 14.4 | 16800 | 5.3784 |
| 4.9342 | 14.57 | 17000 | 5.3735 |
| 4.9271 | 14.74 | 17200 | 5.3711 |
| 4.9328 | 14.91 | 17400 | 5.3646 |
| 4.8994 | 15.08 | 17600 | 5.3664 |
| 4.8932 | 15.25 | 17800 | 5.3642 |
| 4.8886 | 15.42 | 18000 | 5.3620 |
| 4.8997 | 15.6 | 18200 | 5.3584 |
| 4.8846 | 15.77 | 18400 | 5.3551 |
| 4.8993 | 15.94 | 18600 | 5.3516 |
| 4.8648 | 16.11 | 18800 | 5.3552 |
| 4.8838 | 16.28 | 19000 | 5.3512 |
| 4.8575 | 16.45 | 19200 | 5.3478 |
| 4.8623 | 16.62 | 19400 | 5.3480 |
| 4.8631 | 16.8 | 19600 | 5.3439 |
| 4.8576 | 16.97 | 19800 | 5.3428 |
| 4.8265 | 17.14 | 20000 | 5.3420 |
| 4.8523 | 17.31 | 20200 | 5.3410 |
| 4.8477 | 17.48 | 20400 | 5.3396 |
| 4.8507 | 17.65 | 20600 | 5.3380 |
| 4.8498 | 17.82 | 20800 | 5.3333 |
| 4.8261 | 17.99 | 21000 | 5.3342 |
| 4.8201 | 18.17 | 21200 | 5.3324 |
| 4.8214 | 18.34 | 21400 | 5.3341 |
| 4.8195 | 18.51 | 21600 | 5.3315 |
| 4.8216 | 18.68 | 21800 | 5.3335 |
| 4.8243 | 18.85 | 22000 | 5.3291 |
| 4.832 | 19.02 | 22200 | 5.3295 |
| 4.8085 | 19.19 | 22400 | 5.3309 |
| 4.8094 | 19.37 | 22600 | 5.3283 |
| 4.815 | 19.54 | 22800 | 5.3280 |
| 4.8219 | 19.71 | 23000 | 5.3270 |
| 4.8117 | 19.88 | 23200 | 5.3280 |

### Framework versions

- Transformers 4.13.0.dev0
- Pytorch 1.10.0
- Datasets 1.14.0
- Tokenizers 0.10.3