Commit 43c52e3 · IParraMartin · verified · 1 Parent(s): c4c8029

End of training

Files changed (2)
  1. README.md +438 -0
  2. generation_config.json +6 -0
README.md ADDED
@@ -0,0 +1,438 @@
+ ---
+ library_name: transformers
+ tags:
+ - generated_from_trainer
+ model-index:
+ - name: impossible-llms-dutch-random-trigram
+   results: []
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # impossible-llms-dutch-random-trigram
+
+ This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 7.1767
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 0.0001
+ - train_batch_size: 12
+ - eval_batch_size: 8
+ - seed: 0
+ - distributed_type: multi-GPU
+ - num_devices: 4
+ - gradient_accumulation_steps: 8
+ - total_train_batch_size: 384
+ - total_eval_batch_size: 32
+ - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
+ - lr_scheduler_type: cosine
+ - lr_scheduler_warmup_ratio: 0.1
+ - training_steps: 3000
+ - mixed_precision_training: Native AMP
+ - label_smoothing_factor: 0.1
+
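Reviewer note: the derived values in the hyperparameter list can be checked with a little arithmetic. The sketch below recomputes the total batch sizes from the per-device values and approximates the learning-rate curve implied by `lr_scheduler_type: cosine` with `lr_scheduler_warmup_ratio: 0.1`. The function names are hypothetical, and the schedule shape (linear warmup, then a half-cosine decay to zero) is an assumption about how the scheduler behaves, not something this commit states.

```python
import math

# Values copied from the hyperparameter list in the README diff.
LEARNING_RATE = 1e-4
TRAIN_BATCH_SIZE = 12       # per device
EVAL_BATCH_SIZE = 8         # per device
NUM_DEVICES = 4
GRAD_ACCUM_STEPS = 8
TRAINING_STEPS = 3000
WARMUP_RATIO = 0.1


def total_batch_size(per_device: int, devices: int, accum_steps: int = 1) -> int:
    """Number of sequences contributing to one optimizer step (or one eval pass step)."""
    return per_device * devices * accum_steps


def lr_at_step(step: int, peak_lr: float, total_steps: int, warmup_steps: int) -> float:
    """Assumed schedule: linear warmup to peak_lr, then half-cosine decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


total_train = total_batch_size(TRAIN_BATCH_SIZE, NUM_DEVICES, GRAD_ACCUM_STEPS)
total_eval = total_batch_size(EVAL_BATCH_SIZE, NUM_DEVICES)
# The library's exact rounding rule is not asserted here; 3000 * 0.1 is 300 either way.
warmup_steps = int(round(TRAINING_STEPS * WARMUP_RATIO))

if __name__ == "__main__":
    print("total_train_batch_size:", total_train)
    print("total_eval_batch_size:", total_eval)
    print("warmup_steps:", warmup_steps)
    for step in (0, 150, 300, 1650, 3000):
        print(step, lr_at_step(step, LEARNING_RATE, TRAINING_STEPS, warmup_steps))
```

Under these assumptions the learning rate peaks at step 300 and reaches zero at step 3000, which would be consistent with the training loss flattening out over the last few hundred steps of the results table.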
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss |
+ |:-------------:|:-----:|:----:|:---------------:|
+ | 82.5994 | 1.0 | 8 | 10.0542 |
+ | 75.3584 | 2.0 | 16 | 9.2715 |
+ | 71.8298 | 3.0 | 24 | 8.8846 |
+ | 69.7388 | 4.0 | 32 | 8.6622 |
+ | 67.9059 | 5.0 | 40 | 8.4515 |
+ | 65.8943 | 6.0 | 48 | 8.2601 |
+ | 64.9371 | 7.0 | 56 | 8.0639 |
+ | 64.1393 | 8.0 | 64 | 7.8508 |
+ | 61.5761 | 9.0 | 72 | 7.6357 |
+ | 60.1545 | 10.0 | 80 | 7.4159 |
+ | 58.5955 | 11.0 | 88 | 7.1999 |
+ | 56.5156 | 12.0 | 96 | 6.9792 |
+ | 54.4413 | 13.0 | 104 | 6.7578 |
+ | 52.4555 | 14.0 | 112 | 6.5518 |
+ | 51.4514 | 15.0 | 120 | 6.3584 |
+ | 50.1681 | 16.0 | 128 | 6.2014 |
+ | 48.6725 | 17.0 | 136 | 6.0718 |
+ | 48.0363 | 18.0 | 144 | 6.0036 |
+ | 47.4449 | 19.0 | 152 | 5.9306 |
+ | 47.7943 | 20.0 | 160 | 5.8906 |
+ | 46.7669 | 21.0 | 168 | 5.8501 |
+ | 46.9906 | 22.0 | 176 | 5.8311 |
+ | 46.7802 | 23.0 | 184 | 5.8032 |
+ | 46.3167 | 24.0 | 192 | 5.7732 |
+ | 46.1253 | 25.0 | 200 | 5.7465 |
+ | 46.1639 | 26.0 | 208 | 5.7233 |
+ | 45.7313 | 27.0 | 216 | 5.7011 |
+ | 44.8697 | 28.0 | 224 | 5.6811 |
+ | 44.9405 | 29.0 | 232 | 5.6618 |
+ | 44.7098 | 30.0 | 240 | 5.6458 |
+ | 44.5541 | 31.0 | 248 | 5.6263 |
+ | 44.9985 | 32.0 | 256 | 5.6188 |
+ | 44.7164 | 33.0 | 264 | 5.6017 |
+ | 44.3521 | 34.0 | 272 | 5.5895 |
+ | 44.1616 | 35.0 | 280 | 5.5672 |
+ | 44.4745 | 36.0 | 288 | 5.5586 |
+ | 43.884 | 37.0 | 296 | 5.5509 |
+ | 44.041 | 38.0 | 304 | 5.5372 |
+ | 43.8014 | 39.0 | 312 | 5.5197 |
+ | 43.5336 | 40.0 | 320 | 5.5044 |
+ | 43.8563 | 41.0 | 328 | 5.4919 |
+ | 43.3208 | 42.0 | 336 | 5.4814 |
+ | 42.4841 | 43.0 | 344 | 5.4666 |
+ | 42.682 | 44.0 | 352 | 5.4504 |
+ | 42.5942 | 45.0 | 360 | 5.4392 |
+ | 42.6471 | 46.0 | 368 | 5.4281 |
+ | 42.0434 | 47.0 | 376 | 5.4126 |
+ | 42.0182 | 48.0 | 384 | 5.3987 |
+ | 41.2785 | 49.0 | 392 | 5.3823 |
+ | 41.9833 | 50.0 | 400 | 5.3747 |
+ | 41.2029 | 51.0 | 408 | 5.3579 |
+ | 41.4731 | 52.0 | 416 | 5.3561 |
+ | 41.0419 | 53.0 | 424 | 5.3445 |
+ | 41.1785 | 54.0 | 432 | 5.3324 |
+ | 40.6796 | 55.0 | 440 | 5.3220 |
+ | 40.8005 | 56.0 | 448 | 5.3085 |
+ | 40.8224 | 57.0 | 456 | 5.3090 |
+ | 40.7455 | 58.0 | 464 | 5.2966 |
+ | 40.1623 | 59.0 | 472 | 5.2913 |
+ | 39.9811 | 60.0 | 480 | 5.2814 |
+ | 39.9838 | 61.0 | 488 | 5.2715 |
+ | 39.6819 | 62.0 | 496 | 5.2678 |
+ | 39.9066 | 63.0 | 504 | 5.2580 |
+ | 39.4079 | 64.0 | 512 | 5.2548 |
+ | 39.2164 | 65.0 | 520 | 5.2503 |
+ | 39.5161 | 66.0 | 528 | 5.2464 |
+ | 39.2483 | 67.0 | 536 | 5.2417 |
+ | 38.9285 | 68.0 | 544 | 5.2479 |
+ | 38.82 | 69.0 | 552 | 5.2428 |
+ | 38.8687 | 70.0 | 560 | 5.2381 |
+ | 38.5354 | 71.0 | 568 | 5.2322 |
+ | 37.9785 | 72.0 | 576 | 5.2324 |
+ | 38.2662 | 73.0 | 584 | 5.2372 |
+ | 37.8792 | 74.0 | 592 | 5.2330 |
+ | 37.8011 | 75.0 | 600 | 5.2385 |
+ | 37.5622 | 76.0 | 608 | 5.2393 |
+ | 37.4363 | 77.0 | 616 | 5.2367 |
+ | 37.2311 | 78.0 | 624 | 5.2437 |
+ | 37.388 | 79.0 | 632 | 5.2430 |
+ | 36.7787 | 80.0 | 640 | 5.2490 |
+ | 36.752 | 81.0 | 648 | 5.2609 |
+ | 36.6417 | 82.0 | 656 | 5.2538 |
+ | 36.3101 | 83.0 | 664 | 5.2602 |
+ | 36.1801 | 84.0 | 672 | 5.2574 |
+ | 36.2202 | 85.0 | 680 | 5.2636 |
+ | 35.8867 | 86.0 | 688 | 5.2841 |
+ | 35.3909 | 87.0 | 696 | 5.2803 |
+ | 35.5727 | 88.0 | 704 | 5.2949 |
+ | 35.7404 | 89.0 | 712 | 5.2919 |
+ | 35.3867 | 90.0 | 720 | 5.3118 |
+ | 35.3958 | 91.0 | 728 | 5.3114 |
+ | 34.9035 | 92.0 | 736 | 5.3233 |
+ | 34.8338 | 93.0 | 744 | 5.3217 |
+ | 34.5983 | 94.0 | 752 | 5.3424 |
+ | 34.6717 | 95.0 | 760 | 5.3457 |
+ | 34.0202 | 96.0 | 768 | 5.3570 |
+ | 34.194 | 97.0 | 776 | 5.3590 |
+ | 33.7407 | 98.0 | 784 | 5.3684 |
+ | 33.6082 | 99.0 | 792 | 5.3842 |
+ | 33.8825 | 100.0 | 800 | 5.3972 |
+ | 33.5653 | 101.0 | 808 | 5.4093 |
+ | 33.0536 | 102.0 | 816 | 5.4197 |
+ | 33.2688 | 103.0 | 824 | 5.4239 |
+ | 32.8215 | 104.0 | 832 | 5.4438 |
+ | 32.8538 | 105.0 | 840 | 5.4522 |
+ | 32.3872 | 106.0 | 848 | 5.4627 |
+ | 32.6686 | 107.0 | 856 | 5.4723 |
+ | 32.5385 | 108.0 | 864 | 5.4829 |
+ | 32.0684 | 109.0 | 872 | 5.5106 |
+ | 32.2504 | 110.0 | 880 | 5.5188 |
+ | 32.0645 | 111.0 | 888 | 5.5214 |
+ | 31.6109 | 112.0 | 896 | 5.5390 |
+ | 31.5912 | 113.0 | 904 | 5.5493 |
+ | 31.2763 | 114.0 | 912 | 5.5714 |
+ | 31.3219 | 115.0 | 920 | 5.5731 |
+ | 31.103 | 116.0 | 928 | 5.5915 |
+ | 30.8593 | 117.0 | 936 | 5.6075 |
+ | 30.883 | 118.0 | 944 | 5.6222 |
+ | 30.5909 | 119.0 | 952 | 5.6391 |
+ | 30.0987 | 120.0 | 960 | 5.6483 |
+ | 30.4074 | 121.0 | 968 | 5.6613 |
+ | 30.1413 | 122.0 | 976 | 5.6761 |
+ | 29.9734 | 123.0 | 984 | 5.6864 |
+ | 29.761 | 124.0 | 992 | 5.6941 |
+ | 29.7224 | 125.0 | 1000 | 5.7183 |
+ | 29.5969 | 126.0 | 1008 | 5.7312 |
+ | 29.4819 | 127.0 | 1016 | 5.7498 |
+ | 29.122 | 128.0 | 1024 | 5.7577 |
+ | 28.8915 | 129.0 | 1032 | 5.7759 |
+ | 28.9803 | 130.0 | 1040 | 5.7829 |
+ | 28.7822 | 131.0 | 1048 | 5.7881 |
+ | 28.6867 | 132.0 | 1056 | 5.8079 |
+ | 28.5127 | 133.0 | 1064 | 5.8202 |
+ | 28.2518 | 134.0 | 1072 | 5.8397 |
+ | 27.9477 | 135.0 | 1080 | 5.8579 |
+ | 27.9133 | 136.0 | 1088 | 5.8667 |
+ | 27.9604 | 137.0 | 1096 | 5.8785 |
+ | 27.6479 | 138.0 | 1104 | 5.8923 |
+ | 27.4398 | 139.0 | 1112 | 5.8948 |
+ | 27.4453 | 140.0 | 1120 | 5.9274 |
+ | 27.107 | 141.0 | 1128 | 5.9330 |
+ | 27.1592 | 142.0 | 1136 | 5.9402 |
+ | 26.7765 | 143.0 | 1144 | 5.9617 |
+ | 26.7436 | 144.0 | 1152 | 5.9688 |
+ | 26.4797 | 145.0 | 1160 | 5.9934 |
+ | 26.6271 | 146.0 | 1168 | 5.9979 |
+ | 26.5695 | 147.0 | 1176 | 6.0227 |
+ | 26.2278 | 148.0 | 1184 | 6.0268 |
+ | 26.3147 | 149.0 | 1192 | 6.0485 |
+ | 26.0386 | 150.0 | 1200 | 6.0496 |
+ | 25.9994 | 151.0 | 1208 | 6.0736 |
+ | 25.6954 | 152.0 | 1216 | 6.0737 |
+ | 25.6808 | 153.0 | 1224 | 6.0932 |
+ | 25.5726 | 154.0 | 1232 | 6.1188 |
+ | 25.5548 | 155.0 | 1240 | 6.1280 |
+ | 25.3248 | 156.0 | 1248 | 6.1349 |
+ | 25.0167 | 157.0 | 1256 | 6.1471 |
+ | 24.9439 | 158.0 | 1264 | 6.1657 |
+ | 24.9627 | 159.0 | 1272 | 6.1655 |
+ | 24.797 | 160.0 | 1280 | 6.1841 |
+ | 24.8176 | 161.0 | 1288 | 6.1916 |
+ | 24.4445 | 162.0 | 1296 | 6.2120 |
+ | 24.4471 | 163.0 | 1304 | 6.2158 |
+ | 24.4066 | 164.0 | 1312 | 6.2300 |
+ | 24.1849 | 165.0 | 1320 | 6.2481 |
+ | 24.2606 | 166.0 | 1328 | 6.2574 |
+ | 24.1559 | 167.0 | 1336 | 6.2774 |
+ | 23.8622 | 168.0 | 1344 | 6.2832 |
+ | 23.7267 | 169.0 | 1352 | 6.2942 |
+ | 23.586 | 170.0 | 1360 | 6.3067 |
+ | 23.5871 | 171.0 | 1368 | 6.3185 |
+ | 23.3116 | 172.0 | 1376 | 6.3322 |
+ | 23.3358 | 173.0 | 1384 | 6.3359 |
+ | 23.2364 | 174.0 | 1392 | 6.3462 |
+ | 23.2253 | 175.0 | 1400 | 6.3604 |
+ | 23.0764 | 176.0 | 1408 | 6.3661 |
+ | 22.9777 | 177.0 | 1416 | 6.3790 |
+ | 22.843 | 178.0 | 1424 | 6.3884 |
+ | 22.7189 | 179.0 | 1432 | 6.4069 |
+ | 22.7373 | 180.0 | 1440 | 6.4238 |
+ | 22.6216 | 181.0 | 1448 | 6.4147 |
+ | 22.5603 | 182.0 | 1456 | 6.4370 |
+ | 22.3906 | 183.0 | 1464 | 6.4491 |
+ | 22.4381 | 184.0 | 1472 | 6.4585 |
+ | 22.1994 | 185.0 | 1480 | 6.4711 |
+ | 22.0592 | 186.0 | 1488 | 6.4803 |
+ | 21.9095 | 187.0 | 1496 | 6.4879 |
+ | 21.9612 | 188.0 | 1504 | 6.4955 |
+ | 21.9603 | 189.0 | 1512 | 6.5081 |
+ | 21.846 | 190.0 | 1520 | 6.5085 |
+ | 21.6954 | 191.0 | 1528 | 6.5342 |
+ | 21.6045 | 192.0 | 1536 | 6.5455 |
+ | 21.5128 | 193.0 | 1544 | 6.5483 |
+ | 21.4015 | 194.0 | 1552 | 6.5560 |
+ | 21.4992 | 195.0 | 1560 | 6.5607 |
+ | 21.296 | 196.0 | 1568 | 6.5677 |
+ | 21.2518 | 197.0 | 1576 | 6.5858 |
+ | 21.223 | 198.0 | 1584 | 6.5859 |
+ | 21.1109 | 199.0 | 1592 | 6.5945 |
+ | 21.0745 | 200.0 | 1600 | 6.6123 |
+ | 20.9234 | 201.0 | 1608 | 6.6232 |
+ | 20.8848 | 202.0 | 1616 | 6.6257 |
+ | 20.6494 | 203.0 | 1624 | 6.6360 |
+ | 20.5728 | 204.0 | 1632 | 6.6397 |
+ | 20.6611 | 205.0 | 1640 | 6.6523 |
+ | 20.6581 | 206.0 | 1648 | 6.6601 |
+ | 20.5148 | 207.0 | 1656 | 6.6650 |
+ | 20.3811 | 208.0 | 1664 | 6.6797 |
+ | 20.3773 | 209.0 | 1672 | 6.6829 |
+ | 20.3413 | 210.0 | 1680 | 6.6976 |
+ | 20.2472 | 211.0 | 1688 | 6.7077 |
+ | 20.0545 | 212.0 | 1696 | 6.7108 |
+ | 20.1101 | 213.0 | 1704 | 6.7161 |
+ | 19.9425 | 214.0 | 1712 | 6.7213 |
+ | 19.9614 | 215.0 | 1720 | 6.7333 |
+ | 19.8209 | 216.0 | 1728 | 6.7464 |
+ | 19.8237 | 217.0 | 1736 | 6.7525 |
+ | 19.6726 | 218.0 | 1744 | 6.7475 |
+ | 19.7297 | 219.0 | 1752 | 6.7594 |
+ | 19.6377 | 220.0 | 1760 | 6.7700 |
+ | 19.6132 | 221.0 | 1768 | 6.7751 |
+ | 19.5049 | 222.0 | 1776 | 6.7832 |
+ | 19.4827 | 223.0 | 1784 | 6.7866 |
+ | 19.3998 | 224.0 | 1792 | 6.7915 |
+ | 19.3534 | 225.0 | 1800 | 6.8059 |
+ | 19.2848 | 226.0 | 1808 | 6.8061 |
+ | 19.3685 | 227.0 | 1816 | 6.8101 |
+ | 19.2081 | 228.0 | 1824 | 6.8217 |
+ | 19.1761 | 229.0 | 1832 | 6.8128 |
+ | 19.1179 | 230.0 | 1840 | 6.8253 |
+ | 19.0693 | 231.0 | 1848 | 6.8385 |
+ | 18.9306 | 232.0 | 1856 | 6.8459 |
+ | 18.9219 | 233.0 | 1864 | 6.8500 |
+ | 18.8905 | 234.0 | 1872 | 6.8570 |
+ | 18.8549 | 235.0 | 1880 | 6.8631 |
+ | 18.7845 | 236.0 | 1888 | 6.8661 |
+ | 18.7904 | 237.0 | 1896 | 6.8749 |
+ | 18.7142 | 238.0 | 1904 | 6.8875 |
+ | 18.6035 | 239.0 | 1912 | 6.8926 |
+ | 18.5459 | 240.0 | 1920 | 6.8939 |
+ | 18.5899 | 241.0 | 1928 | 6.8945 |
+ | 18.5584 | 242.0 | 1936 | 6.9038 |
+ | 18.4848 | 243.0 | 1944 | 6.9132 |
+ | 18.5062 | 244.0 | 1952 | 6.9161 |
+ | 18.3082 | 245.0 | 1960 | 6.9171 |
+ | 18.3617 | 246.0 | 1968 | 6.9295 |
+ | 18.3946 | 247.0 | 1976 | 6.9279 |
+ | 18.2304 | 248.0 | 1984 | 6.9355 |
+ | 18.2184 | 249.0 | 1992 | 6.9421 |
+ | 18.213 | 250.0 | 2000 | 6.9424 |
+ | 18.1752 | 251.0 | 2008 | 6.9480 |
+ | 18.06 | 252.0 | 2016 | 6.9539 |
+ | 18.0693 | 253.0 | 2024 | 6.9560 |
+ | 18.0189 | 254.0 | 2032 | 6.9561 |
+ | 17.9015 | 255.0 | 2040 | 6.9649 |
+ | 17.9693 | 256.0 | 2048 | 6.9699 |
+ | 17.9178 | 257.0 | 2056 | 6.9857 |
+ | 17.8822 | 258.0 | 2064 | 6.9825 |
+ | 17.8456 | 259.0 | 2072 | 6.9852 |
+ | 17.8385 | 260.0 | 2080 | 6.9851 |
+ | 17.7816 | 261.0 | 2088 | 6.9962 |
+ | 17.7009 | 262.0 | 2096 | 6.9984 |
+ | 17.7425 | 263.0 | 2104 | 7.0047 |
+ | 17.6348 | 264.0 | 2112 | 7.0037 |
+ | 17.6382 | 265.0 | 2120 | 7.0135 |
+ | 17.7061 | 266.0 | 2128 | 7.0123 |
+ | 17.661 | 267.0 | 2136 | 7.0149 |
+ | 17.5448 | 268.0 | 2144 | 7.0211 |
+ | 17.4749 | 269.0 | 2152 | 7.0287 |
+ | 17.5358 | 270.0 | 2160 | 7.0287 |
+ | 17.4606 | 271.0 | 2168 | 7.0346 |
+ | 17.4813 | 272.0 | 2176 | 7.0378 |
+ | 17.403 | 273.0 | 2184 | 7.0462 |
+ | 17.4206 | 274.0 | 2192 | 7.0419 |
+ | 17.4906 | 275.0 | 2200 | 7.0413 |
+ | 17.3353 | 276.0 | 2208 | 7.0498 |
+ | 17.3957 | 277.0 | 2216 | 7.0507 |
+ | 17.3451 | 278.0 | 2224 | 7.0582 |
+ | 17.3083 | 279.0 | 2232 | 7.0585 |
+ | 17.2388 | 280.0 | 2240 | 7.0610 |
+ | 17.2831 | 281.0 | 2248 | 7.0702 |
+ | 17.1745 | 282.0 | 2256 | 7.0705 |
+ | 17.1825 | 283.0 | 2264 | 7.0736 |
+ | 17.1351 | 284.0 | 2272 | 7.0730 |
+ | 17.1355 | 285.0 | 2280 | 7.0778 |
+ | 17.1596 | 286.0 | 2288 | 7.0801 |
+ | 17.0965 | 287.0 | 2296 | 7.0782 |
+ | 17.0982 | 288.0 | 2304 | 7.0877 |
+ | 17.0794 | 289.0 | 2312 | 7.0873 |
+ | 16.9511 | 290.0 | 2320 | 7.1009 |
+ | 17.0132 | 291.0 | 2328 | 7.0933 |
+ | 16.9379 | 292.0 | 2336 | 7.0972 |
+ | 16.9018 | 293.0 | 2344 | 7.1025 |
+ | 16.9297 | 294.0 | 2352 | 7.1038 |
+ | 16.9443 | 295.0 | 2360 | 7.1024 |
+ | 16.9367 | 296.0 | 2368 | 7.1066 |
+ | 16.8805 | 297.0 | 2376 | 7.1074 |
+ | 16.8863 | 298.0 | 2384 | 7.1133 |
+ | 16.8961 | 299.0 | 2392 | 7.1092 |
+ | 16.8387 | 300.0 | 2400 | 7.1125 |
+ | 16.8368 | 301.0 | 2408 | 7.1157 |
+ | 16.8282 | 302.0 | 2416 | 7.1161 |
+ | 16.8568 | 303.0 | 2424 | 7.1210 |
+ | 16.8066 | 304.0 | 2432 | 7.1196 |
+ | 16.6857 | 305.0 | 2440 | 7.1241 |
+ | 16.7231 | 306.0 | 2448 | 7.1229 |
+ | 16.7 | 307.0 | 2456 | 7.1248 |
+ | 16.7097 | 308.0 | 2464 | 7.1302 |
+ | 16.6619 | 309.0 | 2472 | 7.1302 |
+ | 16.7357 | 310.0 | 2480 | 7.1317 |
+ | 16.6416 | 311.0 | 2488 | 7.1391 |
+ | 16.6208 | 312.0 | 2496 | 7.1367 |
+ | 16.6047 | 313.0 | 2504 | 7.1378 |
+ | 16.5973 | 314.0 | 2512 | 7.1393 |
+ | 16.571 | 315.0 | 2520 | 7.1402 |
+ | 16.5836 | 316.0 | 2528 | 7.1418 |
+ | 16.5634 | 317.0 | 2536 | 7.1435 |
+ | 16.5548 | 318.0 | 2544 | 7.1488 |
+ | 16.563 | 319.0 | 2552 | 7.1510 |
+ | 16.5766 | 320.0 | 2560 | 7.1483 |
+ | 16.4478 | 321.0 | 2568 | 7.1509 |
+ | 16.5622 | 322.0 | 2576 | 7.1535 |
+ | 16.4586 | 323.0 | 2584 | 7.1548 |
+ | 16.4832 | 324.0 | 2592 | 7.1542 |
+ | 16.4289 | 325.0 | 2600 | 7.1570 |
+ | 16.5299 | 326.0 | 2608 | 7.1548 |
+ | 16.4647 | 327.0 | 2616 | 7.1581 |
+ | 16.4929 | 328.0 | 2624 | 7.1577 |
+ | 16.4312 | 329.0 | 2632 | 7.1594 |
+ | 16.5021 | 330.0 | 2640 | 7.1604 |
+ | 16.4607 | 331.0 | 2648 | 7.1632 |
+ | 16.4328 | 332.0 | 2656 | 7.1623 |
+ | 16.3884 | 333.0 | 2664 | 7.1656 |
+ | 16.4128 | 334.0 | 2672 | 7.1655 |
+ | 16.4234 | 335.0 | 2680 | 7.1646 |
+ | 16.4392 | 336.0 | 2688 | 7.1665 |
+ | 16.3881 | 337.0 | 2696 | 7.1660 |
+ | 16.3477 | 338.0 | 2704 | 7.1682 |
+ | 16.4096 | 339.0 | 2712 | 7.1681 |
+ | 16.3908 | 340.0 | 2720 | 7.1702 |
+ | 16.3873 | 341.0 | 2728 | 7.1686 |
+ | 16.4087 | 342.0 | 2736 | 7.1711 |
+ | 16.3875 | 343.0 | 2744 | 7.1713 |
+ | 16.3314 | 344.0 | 2752 | 7.1716 |
+ | 16.3994 | 345.0 | 2760 | 7.1733 |
+ | 16.3845 | 346.0 | 2768 | 7.1713 |
+ | 16.3095 | 347.0 | 2776 | 7.1721 |
+ | 16.3001 | 348.0 | 2784 | 7.1725 |
+ | 16.3388 | 349.0 | 2792 | 7.1743 |
+ | 16.3279 | 350.0 | 2800 | 7.1716 |
+ | 16.3188 | 351.0 | 2808 | 7.1727 |
+ | 16.3254 | 352.0 | 2816 | 7.1741 |
+ | 16.4517 | 353.0 | 2824 | 7.1747 |
+ | 16.322 | 354.0 | 2832 | 7.1745 |
+ | 16.3631 | 355.0 | 2840 | 7.1748 |
+ | 16.3896 | 356.0 | 2848 | 7.1745 |
+ | 16.329 | 357.0 | 2856 | 7.1751 |
+ | 16.3249 | 358.0 | 2864 | 7.1754 |
+ | 16.3464 | 359.0 | 2872 | 7.1754 |
+ | 16.3886 | 360.0 | 2880 | 7.1760 |
+ | 16.3359 | 361.0 | 2888 | 7.1758 |
+ | 16.2931 | 362.0 | 2896 | 7.1759 |
+ | 16.3569 | 363.0 | 2904 | 7.1761 |
+ | 16.3704 | 364.0 | 2912 | 7.1762 |
+ | 16.3221 | 365.0 | 2920 | 7.1767 |
+ | 16.3058 | 366.0 | 2928 | 7.1768 |
+ | 16.2517 | 367.0 | 2936 | 7.1766 |
+ | 16.3604 | 368.0 | 2944 | 7.1764 |
+ | 16.3752 | 369.0 | 2952 | 7.1764 |
+ | 16.3373 | 370.0 | 2960 | 7.1766 |
+ | 16.3252 | 371.0 | 2968 | 7.1766 |
+ | 16.274 | 372.0 | 2976 | 7.1767 |
+ | 16.3587 | 373.0 | 2984 | 7.1767 |
+ | 16.3647 | 374.0 | 2992 | 7.1766 |
+ | 16.3286 | 375.0 | 3000 | 7.1767 |
+
+
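Reviewer note: in the results table the validation loss stops improving early while the training loss keeps falling, which usually indicates overfitting. The sketch below picks the lowest-validation-loss row out of a small hand-copied subset of the table and compares it with the final reported loss. The `ROWS` list and the helper name are illustrative only, and the subset is not the full log.

```python
# A hand-copied subset of (step, validation_loss) rows from the results table.
# This is a sample for illustration, not the complete training log.
ROWS = [
    (8, 10.0542),
    (400, 5.3747),
    (560, 5.2381),
    (568, 5.2322),
    (576, 5.2324),
    (600, 5.2385),
    (1000, 5.7183),
    (2000, 6.9424),
    (3000, 7.1767),
]


def best_checkpoint(rows):
    """Return the (step, validation_loss) pair with the lowest validation loss."""
    return min(rows, key=lambda row: row[1])


best_step, best_loss = best_checkpoint(ROWS)
final_step, final_loss = ROWS[-1]
# How much worse the final checkpoint is than the best one in this sample.
regression = final_loss - best_loss

if __name__ == "__main__":
    print(f"best validation loss {best_loss} at step {best_step}")
    print(f"final validation loss {final_loss} at step {final_step}")
    print(f"difference: {regression:.4f}")
```

If the intent were to publish the best-generalizing weights rather than the end-of-training ones, an evaluation-based checkpoint selection along these lines would favor a checkpoint near step 568 over the one at step 3000. Whether that matches the goals of this experiment is not stated in the commit.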
+ ### Framework versions
+
+ - Transformers 4.49.0
+ - Pytorch 2.4.0+cu121
+ - Datasets 3.4.0
+ - Tokenizers 0.21.0
generation_config.json ADDED
@@ -0,0 +1,6 @@
+ {
+   "_from_model_config": true,
+   "bos_token_id": 0,
+   "eos_token_id": 0,
+   "transformers_version": "4.49.0"
+ }
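Reviewer note: the added `generation_config.json` assigns the same token id to both `bos_token_id` and `eos_token_id`. The sketch below parses the file contents with the standard library only and makes that sharing explicit. The variable names are hypothetical, and whether the shared id is intentional depends on the tokenizer, which this commit does not show.

```python
import json

# Contents of generation_config.json as added in this commit.
GENERATION_CONFIG_JSON = """
{
  "_from_model_config": true,
  "bos_token_id": 0,
  "eos_token_id": 0,
  "transformers_version": "4.49.0"
}
"""

config = json.loads(GENERATION_CONFIG_JSON)

# True when the beginning-of-sequence and end-of-sequence tokens share one id.
shares_special_token = config["bos_token_id"] == config["eos_token_id"]

if __name__ == "__main__":
    print("bos_token_id:", config["bos_token_id"])
    print("eos_token_id:", config["eos_token_id"])
    print("bos and eos share an id:", shares_special_token)
```

A single shared boundary token is a common setup for GPT-2-style tokenizers, so this is more likely a convention than a mistake, but it is worth confirming against the tokenizer files.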