---
license: mit
base_model: facebook/m2m100_418M
tags:
- generated_from_trainer
metrics:
- bleu
model-index:
- name: m2m100_418M-finetuned-hi-to-en
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# m2m100_418M-finetuned-hi-to-en

This model is a fine-tuned version of [facebook/m2m100_418M](https://huggingface.co/facebook/m2m100_418M) on an unspecified Hindi-to-English translation dataset.
It achieves the following results on the evaluation set:
- Loss: 0.1973
- Bleu: 0.0
- Gen Len: 5.7184

## Model description

[facebook/m2m100_418M](https://huggingface.co/facebook/m2m100_418M) is the 418M-parameter variant of M2M100, a multilingual encoder-decoder (seq2seq) model trained for many-to-many translation between 100 languages. This checkpoint fine-tunes it for Hindi-to-English translation using the procedure described below; the fine-tuning data were not recorded in this card.

## Intended uses & limitations

The model is intended for translating Hindi text into English. Limitations: the fine-tuning and evaluation data are undocumented, evaluation generations are short (average length around 5.7 tokens), and the final reported BLEU is 0.0, so translation quality on longer or out-of-domain inputs has not been demonstrated by this card.
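
Below is a minimal usage sketch following the standard M2M100 generation pattern (set the tokenizer's source language, then force the English BOS token). The repository id is taken from this card's model-index name and is an assumption; substitute the actual Hub path if it differs.

```python
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

# Assumed repository id, taken from this card's model-index name; adjust to the real Hub path.
model_id = "m2m100_418M-finetuned-hi-to-en"

tokenizer = M2M100Tokenizer.from_pretrained(model_id)
model = M2M100ForConditionalGeneration.from_pretrained(model_id)

tokenizer.src_lang = "hi"  # the source text is Hindi
inputs = tokenizer("नमस्ते, आप कैसे हैं?", return_tensors="pt")

# Force English as the target language when generating.
generated_ids = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.get_lang_id("en"),
    max_new_tokens=64,
)
print(tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0])
```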

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 15
- mixed_precision_training: Native AMP
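
As a point of reference, the list above maps onto `Seq2SeqTrainingArguments` roughly as in the sketch below; the output directory and the evaluation cadence (every 500 steps, matching the results table) are assumptions not recorded in this card, and the dataset/model wiring is omitted.

```python
from transformers import Seq2SeqTrainingArguments

# Hedged reconstruction of the hyperparameters listed above (Transformers 4.40.x API).
training_args = Seq2SeqTrainingArguments(
    output_dir="m2m100_418M-finetuned-hi-to-en",  # assumed
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=15,
    fp16=True,                       # "Native AMP" mixed precision
    evaluation_strategy="steps",     # assumed: the results table logs every 500 steps
    eval_steps=500,
    predict_with_generate=True,      # required for BLEU / Gen Len during evaluation
)
```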

### Training results

| Training Loss | Epoch   | Step  | Validation Loss | Bleu    | Gen Len |
|:-------------:|:-------:|:-----:|:---------------:|:-------:|:-------:|
| 2.6398        | 0.1100  | 500   | 2.5624          | 2.434   | 5.8204  |
| 2.6877        | 0.2199  | 1000  | 2.4067          | 6.9764  | 5.6658  |
| 2.6           | 0.3299  | 1500  | 2.3000          | 4.9574  | 5.6818  |
| 2.5495        | 0.4399  | 2000  | 2.2093          | 13.5783 | 5.7773  |
| 2.4986        | 0.5498  | 2500  | 2.1232          | 12.0884 | 5.7156  |
| 2.4475        | 0.6598  | 3000  | 2.0526          | 0.0     | 5.7829  |
| 2.418         | 0.7697  | 3500  | 1.9804          | 0.0     | 5.7902  |
| 2.3652        | 0.8797  | 4000  | 1.9253          | 0.0     | 5.7564  |
| 2.3625        | 0.9897  | 4500  | 1.8681          | 0.0     | 5.7984  |
| 2.024         | 1.0996  | 5000  | 1.8020          | 0.0     | 5.81    |
| 2.0017        | 1.2096  | 5500  | 1.7601          | 0.0     | 5.7493  |
| 2.0036        | 1.3196  | 6000  | 1.7208          | 0.0     | 5.8507  |
| 1.9983        | 1.4295  | 6500  | 1.6662          | 0.0     | 5.742   |
| 1.9838        | 1.5395  | 7000  | 1.6273          | 0.0     | 5.8033  |
| 1.9755        | 1.6494  | 7500  | 1.5914          | 0.0     | 5.8629  |
| 1.9679        | 1.7594  | 8000  | 1.5436          | 0.0     | 5.8751  |
| 1.9386        | 1.8694  | 8500  | 1.5154          | 0.0     | 5.8762  |
| 1.9299        | 1.9793  | 9000  | 1.4725          | 0.0     | 5.82    |
| 1.6886        | 2.0893  | 9500  | 1.4242          | 0.0     | 5.7729  |
| 1.6454        | 2.1993  | 10000 | 1.3867          | 0.0     | 5.7042  |
| 1.6361        | 2.3092  | 10500 | 1.3544          | 0.0     | 5.6789  |
| 1.6482        | 2.4192  | 11000 | 1.3346          | 0.0     | 5.7051  |
| 1.6528        | 2.5291  | 11500 | 1.3043          | 0.0     | 5.7147  |
| 1.6687        | 2.6391  | 12000 | 1.2718          | 0.0     | 5.7633  |
| 1.6428        | 2.7491  | 12500 | 1.2417          | 0.0     | 5.7318  |
| 1.6547        | 2.8590  | 13000 | 1.2086          | 0.0     | 5.7536  |
| 1.6467        | 2.9690  | 13500 | 1.1895          | 0.0     | 5.7458  |
| 1.4526        | 3.0790  | 14000 | 1.1425          | 0.0     | 5.7869  |
| 1.3555        | 3.1889  | 14500 | 1.1204          | 0.0     | 5.7491  |
| 1.4007        | 3.2989  | 15000 | 1.1010          | 0.0     | 5.8267  |
| 1.3799        | 3.4088  | 15500 | 1.0754          | 0.0     | 5.7482  |
| 1.401         | 3.5188  | 16000 | 1.0460          | 0.0     | 5.7571  |
| 1.4093        | 3.6288  | 16500 | 1.0239          | 0.0     | 5.7262  |
| 1.3997        | 3.7387  | 17000 | 1.0024          | 0.0     | 5.692   |
| 1.4162        | 3.8487  | 17500 | 0.9869          | 0.0     | 5.7273  |
| 1.4102        | 3.9587  | 18000 | 0.9558          | 0.0     | 5.7613  |
| 1.2476        | 4.0686  | 18500 | 0.9296          | 0.0     | 5.7113  |
| 1.1591        | 4.1786  | 19000 | 0.9163          | 0.0     | 5.7651  |
| 1.1861        | 4.2885  | 19500 | 0.9017          | 0.0     | 5.7498  |
| 1.1799        | 4.3985  | 20000 | 0.8841          | 0.0     | 5.7884  |
| 1.1902        | 4.5085  | 20500 | 0.8635          | 0.0     | 5.7613  |
| 1.193         | 4.6184  | 21000 | 0.8448          | 0.0     | 5.7507  |
| 1.1955        | 4.7284  | 21500 | 0.8266          | 0.0     | 5.7602  |
| 1.2062        | 4.8384  | 22000 | 0.8069          | 0.0     | 5.7562  |
| 1.2058        | 4.9483  | 22500 | 0.7805          | 0.0     | 5.7087  |
| 1.0832        | 5.0583  | 23000 | 0.7583          | 0.0     | 5.7631  |
| 0.9869        | 5.1682  | 23500 | 0.7497          | 0.0     | 5.7284  |
| 0.9956        | 5.2782  | 24000 | 0.7356          | 0.0     | 5.7438  |
| 1.0164        | 5.3882  | 24500 | 0.7253          | 0.0     | 5.7789  |
| 1.017         | 5.4981  | 25000 | 0.7075          | 0.0     | 5.7462  |
| 1.0365        | 5.6081  | 25500 | 0.6890          | 0.0     | 5.7487  |
| 1.0421        | 5.7181  | 26000 | 0.6770          | 0.0     | 5.7547  |
| 1.0344        | 5.8280  | 26500 | 0.6560          | 0.0     | 5.7624  |
| 1.0286        | 5.9380  | 27000 | 0.6429          | 0.0     | 5.7816  |
| 0.9637        | 6.0479  | 27500 | 0.6257          | 0.0     | 5.7547  |
| 0.8297        | 6.1579  | 28000 | 0.6144          | 0.0     | 5.7649  |
| 0.8625        | 6.2679  | 28500 | 0.6038          | 0.0     | 5.7442  |
| 0.8587        | 6.3778  | 29000 | 0.5889          | 0.0     | 5.7633  |
| 0.8732        | 6.4878  | 29500 | 0.5788          | 0.0     | 5.7676  |
| 0.8738        | 6.5978  | 30000 | 0.5673          | 0.0     | 5.7698  |
| 0.8938        | 6.7077  | 30500 | 0.5521          | 0.0     | 5.7929  |
| 0.8797        | 6.8177  | 31000 | 0.5410          | 0.0     | 5.7542  |
| 0.9055        | 6.9276  | 31500 | 0.5284          | 0.0     | 5.7551  |
| 0.8408        | 7.0376  | 32000 | 0.5154          | 0.0     | 5.754   |
| 0.7278        | 7.1476  | 32500 | 0.5106          | 0.0     | 5.7602  |
| 0.7357        | 7.2575  | 33000 | 0.4958          | 0.0     | 5.7422  |
| 0.7498        | 7.3675  | 33500 | 0.4906          | 0.0     | 5.734   |
| 0.7524        | 7.4775  | 34000 | 0.4804          | 0.0     | 5.7136  |
| 0.7609        | 7.5874  | 34500 | 0.4716          | 0.0     | 5.7504  |
| 0.7555        | 7.6974  | 35000 | 0.4621          | 38.6861 | 5.7544  |
| 0.7752        | 7.8073  | 35500 | 0.4493          | 0.0     | 5.7429  |
| 0.7656        | 7.9173  | 36000 | 0.4387          | 0.0     | 5.7484  |
| 0.7329        | 8.0273  | 36500 | 0.4281          | 0.0     | 5.7364  |
| 0.6314        | 8.1372  | 37000 | 0.4251          | 0.0     | 5.7453  |
| 0.6595        | 8.2472  | 37500 | 0.4161          | 0.0     | 5.7393  |
| 0.6566        | 8.3572  | 38000 | 0.4125          | 0.0     | 5.7502  |
| 0.6582        | 8.4671  | 38500 | 0.4043          | 0.0     | 5.7364  |
| 0.6579        | 8.5771  | 39000 | 0.3962          | 0.0     | 5.7422  |
| 0.6622        | 8.6870  | 39500 | 0.3878          | 0.0     | 5.76    |
| 0.6547        | 8.7970  | 40000 | 0.3790          | 0.0     | 5.7642  |
| 0.6682        | 8.9070  | 40500 | 0.3701          | 0.0     | 5.7549  |
| 0.6499        | 9.0169  | 41000 | 0.3584          | 0.0     | 5.7333  |
| 0.541         | 9.1269  | 41500 | 0.3547          | 0.0     | 5.7398  |
| 0.5621        | 9.2369  | 42000 | 0.3519          | 0.0     | 5.7322  |
| 0.5673        | 9.3468  | 42500 | 0.3458          | 0.0     | 5.7467  |
| 0.5618        | 9.4568  | 43000 | 0.3407          | 0.0     | 5.7382  |
| 0.5704        | 9.5667  | 43500 | 0.3326          | 0.0     | 5.7536  |
| 0.5816        | 9.6767  | 44000 | 0.3292          | 0.0     | 5.7349  |
| 0.5892        | 9.7867  | 44500 | 0.3194          | 0.0     | 5.7358  |
| 0.5796        | 9.8966  | 45000 | 0.3129          | 0.0     | 5.7369  |
| 0.5807        | 10.0066 | 45500 | 0.3079          | 0.0     | 5.7404  |
| 0.4786        | 10.1166 | 46000 | 0.3033          | 0.0     | 5.7491  |
| 0.4863        | 10.2265 | 46500 | 0.2989          | 0.0     | 5.7331  |
| 0.4979        | 10.3365 | 47000 | 0.2968          | 0.0     | 5.732   |
| 0.5015        | 10.4464 | 47500 | 0.2917          | 0.0     | 5.7229  |
| 0.5105        | 10.5564 | 48000 | 0.2886          | 0.0     | 5.7398  |
| 0.5039        | 10.6664 | 48500 | 0.2830          | 0.0     | 5.7173  |
| 0.5202        | 10.7763 | 49000 | 0.2789          | 0.0     | 5.7218  |
| 0.5123        | 10.8863 | 49500 | 0.2742          | 0.0     | 5.7276  |
| 0.5043        | 10.9963 | 50000 | 0.2670          | 0.0     | 5.7191  |
| 0.4314        | 11.1062 | 50500 | 0.2661          | 0.0     | 5.7364  |
| 0.4345        | 11.2162 | 51000 | 0.2612          | 0.0     | 5.7262  |
| 0.4411        | 11.3261 | 51500 | 0.2592          | 0.0     | 5.7233  |
| 0.447         | 11.4361 | 52000 | 0.2568          | 0.0     | 5.7344  |
| 0.453         | 11.5461 | 52500 | 0.2528          | 0.0     | 5.7231  |
| 0.4485        | 11.6560 | 53000 | 0.2496          | 0.0     | 5.7311  |
| 0.4472        | 11.7660 | 53500 | 0.2460          | 0.0     | 5.7167  |
| 0.4567        | 11.8760 | 54000 | 0.2412          | 0.0     | 5.7256  |
| 0.4528        | 11.9859 | 54500 | 0.2381          | 0.0     | 5.7264  |
| 0.404         | 12.0959 | 55000 | 0.2342          | 0.0     | 5.7187  |
| 0.3995        | 12.2059 | 55500 | 0.2333          | 0.0     | 5.7293  |
| 0.3989        | 12.3158 | 56000 | 0.2317          | 0.0     | 5.7104  |
| 0.3988        | 12.4258 | 56500 | 0.2284          | 0.0     | 5.7242  |
| 0.3991        | 12.5357 | 57000 | 0.2261          | 0.0     | 5.7276  |
| 0.4075        | 12.6457 | 57500 | 0.2234          | 0.0     | 5.7198  |
| 0.4074        | 12.7557 | 58000 | 0.2207          | 0.0     | 5.7262  |
| 0.398         | 12.8656 | 58500 | 0.2178          | 0.0     | 5.7282  |
| 0.4003        | 12.9756 | 59000 | 0.2162          | 0.0     | 5.7291  |
| 0.374         | 13.0856 | 59500 | 0.2145          | 0.0     | 5.7271  |
| 0.3749        | 13.1955 | 60000 | 0.2126          | 0.0     | 5.7287  |
| 0.3589        | 13.3055 | 60500 | 0.2109          | 0.0     | 5.7356  |
| 0.3734        | 13.4154 | 61000 | 0.2095          | 0.0     | 5.7329  |
| 0.3706        | 13.5254 | 61500 | 0.2087          | 0.0     | 5.7327  |
| 0.3781        | 13.6354 | 62000 | 0.2071          | 0.0     | 5.7296  |
| 0.3735        | 13.7453 | 62500 | 0.2060          | 0.0     | 5.7287  |
| 0.372         | 13.8553 | 63000 | 0.2039          | 0.0     | 5.718   |
| 0.3751        | 13.9653 | 63500 | 0.2024          | 0.0     | 5.728   |
| 0.3573        | 14.0752 | 64000 | 0.2014          | 0.0     | 5.7189  |
| 0.3322        | 14.1852 | 64500 | 0.2010          | 0.0     | 5.7204  |
| 0.3359        | 14.2951 | 65000 | 0.2003          | 0.0     | 5.7227  |
| 0.3533        | 14.4051 | 65500 | 0.1994          | 0.0     | 5.7222  |
| 0.3489        | 14.5151 | 66000 | 0.1986          | 0.0     | 5.7198  |
| 0.3358        | 14.6250 | 66500 | 0.1981          | 0.0     | 5.7231  |
| 0.3424        | 14.7350 | 67000 | 0.1977          | 0.0     | 5.72    |
| 0.3341        | 14.8450 | 67500 | 0.1976          | 0.0     | 5.7209  |
| 0.3513        | 14.9549 | 68000 | 0.1973          | 0.0     | 5.7184  |
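
The BLEU column above comes from generation-based evaluation during training. As a point of reference only, a minimal sketch of scoring predictions with the `evaluate` library's BLEU metric is shown below; whether the original run used this exact metric implementation is an assumption.

```python
import evaluate

# Toy example: corpus BLEU over hand-written predictions and references.
bleu = evaluate.load("bleu")
predictions = ["hello, how are you?"]
references = [["hello, how are you?"]]
print(bleu.compute(predictions=predictions, references=references)["bleu"])
```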


### Framework versions

- Transformers 4.40.2
- Pytorch 2.3.0+cu121
- Datasets 2.19.1
- Tokenizers 0.19.1