---
license: mit
base_model: facebook/m2m100_1.2B
tags:
- generated_from_trainer
metrics:
- bleu
model-index:
- name: cs_m2m_0.00001_200_v0.2
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# cs_m2m_0.00001_200_v0.2

This model is a fine-tuned version of [facebook/m2m100_1.2B](https://huggingface.co/facebook/m2m100_1.2B) on an unknown dataset.
It achieves the following results on the evaluation set (these are the values from the final checkpoint, epoch 200, step 1200):
- Loss: 8.4603
- Bleu: 0.1346
- Gen Len: 69.619

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 200
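
To make the schedule concrete, the sketch below works out the step budget and the learning rate at a given step under a linear decay. It is a minimal illustration, not the Trainer's own code: the warmup length is not listed in this card, so `warmup_steps=0` is an assumption, as are a single device and no gradient accumulation. The value of 6 steps per epoch is read from the training results table, and the helper name `linear_lr` is hypothetical.

```python
def linear_lr(step, base_lr=1e-5, total_steps=1200, warmup_steps=0):
    """Learning rate after `step` optimizer updates: linear warmup, then linear decay to 0.

    warmup_steps=0 is an assumption; the card does not report a warmup setting.
    """
    if step < warmup_steps:
        return base_lr * (step / max(1, warmup_steps))
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))


# Values taken from the card: 6 optimizer steps per epoch, 200 epochs, batch size 16.
steps_per_epoch = 6
num_epochs = 200
train_batch_size = 16

total_steps = steps_per_epoch * num_epochs

# Assuming one device, no gradient accumulation, and a kept final partial batch,
# 6 steps per epoch implies a training set of this many examples:
min_examples = (steps_per_epoch - 1) * train_batch_size + 1
max_examples = steps_per_epoch * train_batch_size

if __name__ == "__main__":
    print("total optimizer steps:", total_steps)
    print("training examples between", min_examples, "and", max_examples)
    for step in (0, 300, 600, 900, 1200):
        print(step, linear_lr(step, total_steps=total_steps))
```

Under these assumptions the run performs 1200 updates in total, and the learning rate falls from 1e-05 at step 0 to zero at step 1200.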

### Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu   | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:------:|:-------:|
| 2.684         | 1.0   | 6    | 8.4517          | 0.0956 | 61.6667 |
| 1.978         | 2.0   | 12   | 8.4546          | 0.0985 | 61.8095 |
| 2.8654        | 3.0   | 18   | 8.4538          | 0.0961 | 62.4286 |
| 2.8165        | 4.0   | 24   | 8.4550          | 0.0991 | 63.1905 |
| 2.6606        | 5.0   | 30   | 8.4556          | 0.0956 | 61.0476 |
| 3.1159        | 6.0   | 36   | 8.4525          | 0.0964 | 60.5238 |
| 1.813         | 7.0   | 42   | 8.4524          | 0.0961 | 59.8095 |
| 2.9637        | 8.0   | 48   | 8.4520          | 0.0961 | 59.8095 |
| 2.1663        | 9.0   | 54   | 8.4526          | 0.0918 | 59.5714 |
| 2.475         | 10.0  | 60   | 8.4516          | 0.0916 | 59.381  |
| 2.5769        | 11.0  | 66   | 8.4493          | 0.0927 | 60.1905 |
| 2.414         | 12.0  | 72   | 8.4485          | 0.0927 | 60.1905 |
| 2.5985        | 13.0  | 78   | 8.4500          | 0.0946 | 60.1905 |
| 2.6263        | 14.0  | 84   | 8.4527          | 0.1003 | 61.0    |
| 2.2439        | 15.0  | 90   | 8.4533          | 0.0774 | 69.0952 |
| 1.9865        | 16.0  | 96   | 8.4542          | 0.0769 | 69.5238 |
| 2.2472        | 17.0  | 102  | 8.4540          | 0.0766 | 69.7619 |
| 2.5489        | 18.0  | 108  | 8.4534          | 0.0782 | 70.3333 |
| 1.9181        | 19.0  | 114  | 8.4527          | 0.0789 | 70.5714 |
| 2.0332        | 20.0  | 120  | 8.4505          | 0.0785 | 70.7619 |
| 1.9397        | 21.0  | 126  | 8.4488          | 0.0784 | 70.9048 |
| 2.788         | 22.0  | 132  | 8.4480          | 0.0772 | 71.9524 |
| 2.4842        | 23.0  | 138  | 8.4473          | 0.0778 | 71.6667 |
| 2.3397        | 24.0  | 144  | 8.4459          | 0.0975 | 62.6667 |
| 2.3303        | 25.0  | 150  | 8.4448          | 0.1314 | 71.9048 |
| 2.6417        | 26.0  | 156  | 8.4436          | 0.1311 | 71.9524 |
| 2.0759        | 27.0  | 162  | 8.4446          | 0.128  | 71.9524 |
| 2.0973        | 28.0  | 168  | 8.4450          | 0.1659 | 62.1905 |
| 2.9593        | 29.0  | 174  | 8.4455          | 0.1285 | 71.4762 |
| 3.0086        | 30.0  | 180  | 8.4442          | 0.1624 | 61.8571 |
| 2.684         | 31.0  | 186  | 8.4431          | 0.162  | 62.0952 |
| 2.7015        | 32.0  | 192  | 8.4442          | 0.162  | 62.0952 |
| 4.6745        | 33.0  | 198  | 8.4431          | 0.1624 | 62.9048 |
| 2.1913        | 34.0  | 204  | 8.4427          | 0.1607 | 63.0    |
| 2.1685        | 35.0  | 210  | 8.4443          | 0.1671 | 61.4286 |
| 2.3458        | 36.0  | 216  | 8.4458          | 0.1346 | 69.6667 |
| 2.0533        | 37.0  | 222  | 8.4456          | 0.132  | 70.1905 |
| 3.1101        | 38.0  | 228  | 8.4442          | 0.1335 | 69.8095 |
| 2.2737        | 39.0  | 234  | 8.4447          | 0.0787 | 70.7619 |
| 2.4838        | 40.0  | 240  | 8.4476          | 0.0784 | 70.1905 |
| 1.9048        | 41.0  | 246  | 8.4487          | 0.0801 | 70.4762 |
| 2.825         | 42.0  | 252  | 8.4495          | 0.0668 | 79.4286 |
| 1.7811        | 43.0  | 258  | 8.4521          | 0.0639 | 78.2381 |
| 2.1382        | 44.0  | 264  | 8.4545          | 0.0639 | 78.1429 |
| 2.2783        | 45.0  | 270  | 8.4553          | 0.0636 | 78.5714 |
| 2.1117        | 46.0  | 276  | 8.4558          | 0.0636 | 78.5714 |
| 2.0165        | 47.0  | 282  | 8.4563          | 0.0638 | 78.4762 |
| 2.2424        | 48.0  | 288  | 8.4568          | 0.0639 | 78.3333 |
| 2.7404        | 49.0  | 294  | 8.4564          | 0.0627 | 79.5714 |
| 3.3443        | 50.0  | 300  | 8.4560          | 0.0617 | 78.4762 |
| 2.7281        | 51.0  | 306  | 8.4551          | 0.0617 | 78.4762 |
| 2.9189        | 52.0  | 312  | 8.4520          | 0.0757 | 70.7143 |
| 2.3192        | 53.0  | 318  | 8.4512          | 0.0754 | 70.7619 |
| 2.3737        | 54.0  | 324  | 8.4505          | 0.0604 | 78.4286 |
| 2.4041        | 55.0  | 330  | 8.4490          | 0.0606 | 78.0952 |
| 4.5412        | 56.0  | 336  | 8.4478          | 0.0618 | 78.0952 |
| 2.399         | 57.0  | 342  | 8.4469          | 0.0617 | 78.2381 |
| 1.8226        | 58.0  | 348  | 8.4467          | 0.062  | 77.9048 |
| 2.3362        | 59.0  | 354  | 8.4463          | 0.0612 | 77.4762 |
| 2.4263        | 60.0  | 360  | 8.4450          | 0.0612 | 77.4762 |
| 2.7929        | 61.0  | 366  | 8.4439          | 0.0617 | 78.2381 |
| 3.2633        | 62.0  | 372  | 8.4434          | 0.0615 | 78.3333 |
| 2.3451        | 63.0  | 378  | 8.4436          | 0.0607 | 77.9048 |
| 2.8337        | 64.0  | 384  | 8.4429          | 0.061  | 77.4762 |
| 2.7405        | 65.0  | 390  | 8.4430          | 0.0607 | 77.9048 |
| 2.8955        | 66.0  | 396  | 8.4420          | 0.0614 | 78.6667 |
| 2.3475        | 67.0  | 402  | 8.4408          | 0.061  | 79.0952 |
| 2.0904        | 68.0  | 408  | 8.4383          | 0.0608 | 79.1905 |
| 2.4816        | 69.0  | 414  | 8.4367          | 0.0607 | 79.3333 |
| 2.3696        | 70.0  | 420  | 8.4365          | 0.0607 | 79.3333 |
| 2.7587        | 71.0  | 426  | 8.4364          | 0.0616 | 79.5714 |
| 2.0684        | 72.0  | 432  | 8.4369          | 0.0617 | 79.4762 |
| 2.5021        | 73.0  | 438  | 8.4375          | 0.0617 | 79.4762 |
| 1.4037        | 74.0  | 444  | 8.4362          | 0.0759 | 71.0476 |
| 2.1197        | 75.0  | 450  | 8.4357          | 0.0763 | 70.7619 |
| 2.2019        | 76.0  | 456  | 8.4378          | 0.0612 | 78.8571 |
| 1.8674        | 77.0  | 462  | 8.4402          | 0.062  | 77.7619 |
| 4.6628        | 78.0  | 468  | 8.4415          | 0.0769 | 69.3333 |
| 2.5704        | 79.0  | 474  | 8.4420          | 0.0769 | 69.3333 |
| 1.8771        | 80.0  | 480  | 8.4422          | 0.0772 | 69.1905 |
| 1.9444        | 81.0  | 486  | 8.4437          | 0.078  | 70.5238 |
| 2.0133        | 82.0  | 492  | 8.4443          | 0.0771 | 71.1429 |
| 2.8815        | 83.0  | 498  | 8.4445          | 0.0757 | 70.4286 |
| 3.0573        | 84.0  | 504  | 8.4455          | 0.0621 | 77.7143 |
| 2.011         | 85.0  | 510  | 8.4469          | 0.0621 | 77.7143 |
| 1.8176        | 86.0  | 516  | 8.4488          | 0.0621 | 77.7143 |
| 1.505         | 87.0  | 522  | 8.4512          | 0.0621 | 77.7143 |
| 5.016         | 88.0  | 528  | 8.4542          | 0.0622 | 77.5714 |
| 4.8956        | 89.0  | 534  | 8.4565          | 0.0625 | 77.1905 |
| 2.3939        | 90.0  | 540  | 8.4578          | 0.0625 | 77.1905 |
| 1.8629        | 91.0  | 546  | 8.4589          | 0.0622 | 77.5714 |
| 2.7315        | 92.0  | 552  | 8.4599          | 0.0617 | 78.1429 |
| 2.6185        | 93.0  | 558  | 8.4605          | 0.0618 | 78.1429 |
| 2.2754        | 94.0  | 564  | 8.4598          | 0.0617 | 78.2381 |
| 1.9322        | 95.0  | 570  | 8.4582          | 0.0616 | 78.381  |
| 2.1725        | 96.0  | 576  | 8.4583          | 0.0621 | 78.9524 |
| 2.603         | 97.0  | 582  | 8.4576          | 0.0619 | 79.1905 |
| 2.543         | 98.0  | 588  | 8.4569          | 0.0619 | 79.1905 |
| 2.4981        | 99.0  | 594  | 8.4563          | 0.0618 | 79.2857 |
| 1.8449        | 100.0 | 600  | 8.4561          | 0.063  | 80.0952 |
| 3.063         | 101.0 | 606  | 8.4559          | 0.0618 | 79.2857 |
| 1.7031        | 102.0 | 612  | 8.4564          | 0.0622 | 77.7143 |
| 2.6749        | 103.0 | 618  | 8.4563          | 0.0623 | 77.5714 |
| 2.5504        | 104.0 | 624  | 8.4558          | 0.0781 | 69.4286 |
| 1.785         | 105.0 | 630  | 8.4559          | 0.0791 | 69.4286 |
| 2.3876        | 106.0 | 636  | 8.4560          | 0.0753 | 70.5238 |
| 1.9649        | 107.0 | 642  | 8.4556          | 0.0613 | 78.4762 |
| 2.5544        | 108.0 | 648  | 8.4571          | 0.0617 | 78.3333 |
| 2.3048        | 109.0 | 654  | 8.4578          | 0.0619 | 77.9524 |
| 3.2234        | 110.0 | 660  | 8.4595          | 0.0618 | 77.9524 |
| 2.5271        | 111.0 | 666  | 8.4600          | 0.0619 | 77.7619 |
| 2.1592        | 112.0 | 672  | 8.4599          | 0.0621 | 77.8571 |
| 2.1582        | 113.0 | 678  | 8.4600          | 0.0618 | 77.9524 |
| 5.1356        | 114.0 | 684  | 8.4596          | 0.0622 | 77.6667 |
| 3.1661        | 115.0 | 690  | 8.4594          | 0.0622 | 77.7619 |
| 2.1159        | 116.0 | 696  | 8.4597          | 0.0617 | 78.2381 |
| 2.1355        | 117.0 | 702  | 8.4602          | 0.0612 | 78.7143 |
| 2.5071        | 118.0 | 708  | 8.4606          | 0.0631 | 79.9524 |
| 2.5419        | 119.0 | 714  | 8.4608          | 0.0631 | 80.0476 |
| 2.1749        | 120.0 | 720  | 8.4616          | 0.0617 | 79.381  |
| 2.1737        | 121.0 | 726  | 8.4622          | 0.0631 | 80.0476 |
| 2.2413        | 122.0 | 732  | 8.4623          | 0.0633 | 79.8095 |
| 2.2636        | 123.0 | 738  | 8.4624          | 0.0636 | 79.4762 |
| 2.9731        | 124.0 | 744  | 8.4624          | 0.0636 | 79.4762 |
| 2.6207        | 125.0 | 750  | 8.4621          | 0.0636 | 79.4762 |
| 2.6231        | 126.0 | 756  | 8.4602          | 0.0636 | 79.4762 |
| 2.4161        | 127.0 | 762  | 8.4605          | 0.0637 | 79.381  |
| 2.9764        | 128.0 | 768  | 8.4613          | 0.0762 | 70.9524 |
| 2.41          | 129.0 | 774  | 8.4618          | 0.0761 | 71.0476 |
| 2.1357        | 130.0 | 780  | 8.4620          | 0.0762 | 70.7143 |
| 3.211         | 131.0 | 786  | 8.4621          | 0.0762 | 70.7143 |
| 1.8992        | 132.0 | 792  | 8.4623          | 0.0633 | 79.7143 |
| 2.9689        | 133.0 | 798  | 8.4621          | 0.0631 | 79.9524 |
| 2.4456        | 134.0 | 804  | 8.4619          | 0.0629 | 80.0476 |
| 1.9567        | 135.0 | 810  | 8.4620          | 0.063  | 79.8571 |
| 4.3724        | 136.0 | 816  | 8.4619          | 0.0626 | 79.2381 |
| 2.2729        | 137.0 | 822  | 8.4623          | 0.0626 | 79.2381 |
| 2.2375        | 138.0 | 828  | 8.4620          | 0.0625 | 78.2381 |
| 2.0507        | 139.0 | 834  | 8.4617          | 0.0625 | 78.2381 |
| 3.2081        | 140.0 | 840  | 8.4621          | 0.1072 | 78.0952 |
| 3.0478        | 141.0 | 846  | 8.4629          | 0.1072 | 78.0952 |
| 1.6707        | 142.0 | 852  | 8.4628          | 0.1042 | 77.5238 |
| 2.7035        | 143.0 | 858  | 8.4626          | 0.1042 | 77.5238 |
| 2.0088        | 144.0 | 864  | 8.4627          | 0.1042 | 77.5238 |
| 2.2061        | 145.0 | 870  | 8.4619          | 0.1042 | 77.5238 |
| 2.9719        | 146.0 | 876  | 8.4597          | 0.1055 | 76.7143 |
| 1.7429        | 147.0 | 882  | 8.4591          | 0.1335 | 69.0952 |
| 2.0689        | 148.0 | 888  | 8.4590          | 0.1094 | 77.7143 |
| 3.0878        | 149.0 | 894  | 8.4593          | 0.1094 | 77.7143 |
| 2.3762        | 150.0 | 900  | 8.4593          | 0.1083 | 78.381  |
| 1.9409        | 151.0 | 906  | 8.4591          | 0.1083 | 78.381  |
| 2.472         | 152.0 | 912  | 8.4590          | 0.1328 | 70.1905 |
| 2.1888        | 153.0 | 918  | 8.4590          | 0.1341 | 69.619  |
| 2.8783        | 154.0 | 924  | 8.4582          | 0.1341 | 69.619  |
| 2.4719        | 155.0 | 930  | 8.4582          | 0.1318 | 68.9524 |
| 2.4873        | 156.0 | 936  | 8.4579          | 0.1318 | 68.9524 |
| 2.202         | 157.0 | 942  | 8.4576          | 0.1318 | 68.9524 |
| 2.4128        | 158.0 | 948  | 8.4577          | 0.1318 | 68.9524 |
| 1.6922        | 159.0 | 954  | 8.4577          | 0.1318 | 68.9524 |
| 2.5719        | 160.0 | 960  | 8.4582          | 0.1318 | 68.9524 |
| 1.8392        | 161.0 | 966  | 8.4581          | 0.1318 | 68.9524 |
| 2.1349        | 162.0 | 972  | 8.4581          | 0.1318 | 68.9524 |
| 2.0836        | 163.0 | 978  | 8.4586          | 0.1318 | 68.9524 |
| 2.5173        | 164.0 | 984  | 8.4590          | 0.1318 | 68.9524 |
| 1.9422        | 165.0 | 990  | 8.4591          | 0.1318 | 68.9524 |
| 2.4949        | 166.0 | 996  | 8.4591          | 0.1318 | 68.9524 |
| 2.6692        | 167.0 | 1002 | 8.4586          | 0.1318 | 68.9524 |
| 1.5472        | 168.0 | 1008 | 8.4588          | 0.1318 | 68.9524 |
| 5.0693        | 169.0 | 1014 | 8.4589          | 0.1318 | 68.9524 |
| 2.6937        | 170.0 | 1020 | 8.4593          | 0.1318 | 68.9524 |
| 5.0729        | 171.0 | 1026 | 8.4596          | 0.1306 | 69.5238 |
| 2.645         | 172.0 | 1032 | 8.4599          | 0.1306 | 69.5238 |
| 1.671         | 173.0 | 1038 | 8.4600          | 0.1306 | 69.5238 |
| 2.329         | 174.0 | 1044 | 8.4600          | 0.1306 | 69.5238 |
| 2.2443        | 175.0 | 1050 | 8.4597          | 0.1306 | 69.5238 |
| 2.0599        | 176.0 | 1056 | 8.4594          | 0.1306 | 69.5238 |
| 2.0761        | 177.0 | 1062 | 8.4598          | 0.1639 | 60.7619 |
| 2.3301        | 178.0 | 1068 | 8.4595          | 0.1306 | 69.5238 |
| 2.8817        | 179.0 | 1074 | 8.4595          | 0.1306 | 69.5238 |
| 2.3847        | 180.0 | 1080 | 8.4588          | 0.1312 | 69.5238 |
| 2.7967        | 181.0 | 1086 | 8.4586          | 0.1312 | 69.5238 |
| 1.6165        | 182.0 | 1092 | 8.4590          | 0.1308 | 69.6667 |
| 3.2699        | 183.0 | 1098 | 8.4585          | 0.1308 | 69.6667 |
| 2.1596        | 184.0 | 1104 | 8.4587          | 0.1308 | 69.6667 |
| 4.383         | 185.0 | 1110 | 8.4587          | 0.1308 | 69.6667 |
| 2.5019        | 186.0 | 1116 | 8.4587          | 0.1308 | 69.6667 |
| 2.1497        | 187.0 | 1122 | 8.4587          | 0.1308 | 69.6667 |
| 2.7942        | 188.0 | 1128 | 8.4594          | 0.1342 | 69.7619 |
| 2.5737        | 189.0 | 1134 | 8.4595          | 0.1342 | 69.7619 |
| 2.7013        | 190.0 | 1140 | 8.4597          | 0.1342 | 69.7619 |
| 4.7672        | 191.0 | 1146 | 8.4598          | 0.1342 | 69.7619 |
| 4.723         | 192.0 | 1152 | 8.4598          | 0.1342 | 69.7619 |
| 2.2355        | 193.0 | 1158 | 8.4598          | 0.1342 | 69.7619 |
| 1.7872        | 194.0 | 1164 | 8.4599          | 0.1342 | 69.7619 |
| 2.0794        | 195.0 | 1170 | 8.4600          | 0.1342 | 69.7619 |
| 1.6962        | 196.0 | 1176 | 8.4601          | 0.1342 | 69.7619 |
| 2.2855        | 197.0 | 1182 | 8.4602          | 0.1342 | 69.7619 |
| 2.8048        | 198.0 | 1188 | 8.4603          | 0.1346 | 69.619  |
| 1.8135        | 199.0 | 1194 | 8.4603          | 0.1346 | 69.619  |
| 2.395         | 200.0 | 1200 | 8.4603          | 0.1346 | 69.619  |
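
The reported results come from the final epoch, which is not necessarily the best checkpoint. The sketch below shows one way to scan rows of this table and pick a checkpoint by BLEU or by validation loss. It uses only four rows copied from the table above for brevity, and the names `parse_rows` and `ROWS` are hypothetical helpers introduced for illustration.

```python
# A few rows copied verbatim from the training results table (not the full table).
ROWS = """
| 2.684         | 1.0   | 6    | 8.4517          | 0.0956 | 61.6667 |
| 2.1685        | 35.0  | 210  | 8.4443          | 0.1671 | 61.4286 |
| 2.1197        | 75.0  | 450  | 8.4357          | 0.0763 | 70.7619 |
| 2.395         | 200.0 | 1200 | 8.4603          | 0.1346 | 69.619  |
"""

FIELDS = ("train_loss", "epoch", "step", "val_loss", "bleu", "gen_len")


def parse_rows(text):
    """Turn Markdown table rows into a list of dicts keyed by FIELDS."""
    rows = []
    for line in text.strip().splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        rows.append(dict(zip(FIELDS, (float(cell) for cell in cells))))
    return rows


rows = parse_rows(ROWS)
best_bleu = max(rows, key=lambda r: r["bleu"])      # higher BLEU is better
best_loss = min(rows, key=lambda r: r["val_loss"])  # lower loss is better
final = rows[-1]

if __name__ == "__main__":
    print("best BLEU:", best_bleu["bleu"], "at epoch", int(best_bleu["epoch"]))
    print("lowest validation loss:", best_loss["val_loss"], "at epoch", int(best_loss["epoch"]))
    print("final epoch BLEU:", final["bleu"], "validation loss:", final["val_loss"])
```

On these rows, the highest BLEU (0.1671, epoch 35) and the lowest validation loss (8.4357, epoch 75) fall on different epochs, and neither is the final one. Validation loss varies only in the third decimal place across the run, so BLEU is likely the more informative selection criterion here.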


### Framework versions

- Transformers 4.35.2
- Pytorch 1.13.1+cu117
- Datasets 2.16.1
- Tokenizers 0.15.0