m2m100_418M-finetuned-hi-to-en

This model is a fine-tuned version of facebook/m2m100_418M. The training dataset is not named (the card's dataset field is None). It achieves the following results on the evaluation set:

  • Loss: 0.1973
  • Bleu: 0.0
  • Gen Len: 5.7184

Model description

More information needed

Intended uses & limitations

More information needed
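
Until the intended use is documented, the following is only a minimal sketch of how an M2M100 checkpoint is usually run for Hindi-to-English translation with the Transformers library. The full repository id of this fine-tuned model does not appear in the card, so `MODEL_ID` defaults to the base checkpoint as a stand-in and should be replaced. The helper name `translate_hi_to_en` and the `max_new_tokens` value are illustrative assumptions, not settings taken from the card.

```python
# Sketch only: MODEL_ID is a placeholder for the fine-tuned repository id,
# which the card does not state in full.
MODEL_ID = "facebook/m2m100_418M"  # base checkpoint, used here as a stand-in
SRC_LANG = "hi"  # M2M100 language code for Hindi
TGT_LANG = "en"  # M2M100 language code for English


def translate_hi_to_en(texts, model_id=MODEL_ID, max_new_tokens=64):
    """Translate a list of Hindi strings to English with an M2M100 checkpoint."""
    # Imported lazily so this module loads even without transformers installed.
    from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

    tokenizer = M2M100Tokenizer.from_pretrained(model_id)
    model = M2M100ForConditionalGeneration.from_pretrained(model_id)

    # M2M100 needs the source language set on the tokenizer and the target
    # language forced as the first generated token.
    tokenizer.src_lang = SRC_LANG
    encoded = tokenizer(texts, return_tensors="pt", padding=True)
    generated = model.generate(
        **encoded,
        forced_bos_token_id=tokenizer.get_lang_id(TGT_LANG),
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)
```

Calling `translate_hi_to_en(["..."])` with a Hindi sentence downloads the weights on first use and returns a list with one English string per input.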

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 15
  • mixed_precision_training: Native AMP
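
The list above maps fairly directly onto `Seq2SeqTrainingArguments` in Transformers. The sketch below shows one plausible mapping, not the author's actual training script. It assumes a single device, so the reported batch sizes are used as per-device values, and it assumes `predict_with_generate=True` because BLEU and generation length are reported. The names `TRAINING_KWARGS` and `build_training_args` are hypothetical.

```python
# Keyword arguments mirroring the hyperparameters listed in the card.
TRAINING_KWARGS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 32,  # assumption: single device
    "per_device_eval_batch_size": 32,   # assumption: single device
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 15,
    "fp16": True,  # "Native AMP"; requires a CUDA device at run time
    "predict_with_generate": True,  # assumption: needed for BLEU / Gen Len
}


def build_training_args(output_dir):
    """Build Seq2SeqTrainingArguments from the values above."""
    # Imported lazily so the dictionary can be inspected without transformers.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)
```

The evaluation interval is left out on purpose: the results table suggests evaluation every 500 steps, but the card does not list that setting.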

Training results

Training Loss Epoch Step Validation Loss Bleu Gen Len
2.6398 0.1100 500 2.5624 2.434 5.8204
2.6877 0.2199 1000 2.4067 6.9764 5.6658
2.6 0.3299 1500 2.3000 4.9574 5.6818
2.5495 0.4399 2000 2.2093 13.5783 5.7773
2.4986 0.5498 2500 2.1232 12.0884 5.7156
2.4475 0.6598 3000 2.0526 0.0 5.7829
2.418 0.7697 3500 1.9804 0.0 5.7902
2.3652 0.8797 4000 1.9253 0.0 5.7564
2.3625 0.9897 4500 1.8681 0.0 5.7984
2.024 1.0996 5000 1.8020 0.0 5.81
2.0017 1.2096 5500 1.7601 0.0 5.7493
2.0036 1.3196 6000 1.7208 0.0 5.8507
1.9983 1.4295 6500 1.6662 0.0 5.742
1.9838 1.5395 7000 1.6273 0.0 5.8033
1.9755 1.6494 7500 1.5914 0.0 5.8629
1.9679 1.7594 8000 1.5436 0.0 5.8751
1.9386 1.8694 8500 1.5154 0.0 5.8762
1.9299 1.9793 9000 1.4725 0.0 5.82
1.6886 2.0893 9500 1.4242 0.0 5.7729
1.6454 2.1993 10000 1.3867 0.0 5.7042
1.6361 2.3092 10500 1.3544 0.0 5.6789
1.6482 2.4192 11000 1.3346 0.0 5.7051
1.6528 2.5291 11500 1.3043 0.0 5.7147
1.6687 2.6391 12000 1.2718 0.0 5.7633
1.6428 2.7491 12500 1.2417 0.0 5.7318
1.6547 2.8590 13000 1.2086 0.0 5.7536
1.6467 2.9690 13500 1.1895 0.0 5.7458
1.4526 3.0790 14000 1.1425 0.0 5.7869
1.3555 3.1889 14500 1.1204 0.0 5.7491
1.4007 3.2989 15000 1.1010 0.0 5.8267
1.3799 3.4088 15500 1.0754 0.0 5.7482
1.401 3.5188 16000 1.0460 0.0 5.7571
1.4093 3.6288 16500 1.0239 0.0 5.7262
1.3997 3.7387 17000 1.0024 0.0 5.692
1.4162 3.8487 17500 0.9869 0.0 5.7273
1.4102 3.9587 18000 0.9558 0.0 5.7613
1.2476 4.0686 18500 0.9296 0.0 5.7113
1.1591 4.1786 19000 0.9163 0.0 5.7651
1.1861 4.2885 19500 0.9017 0.0 5.7498
1.1799 4.3985 20000 0.8841 0.0 5.7884
1.1902 4.5085 20500 0.8635 0.0 5.7613
1.193 4.6184 21000 0.8448 0.0 5.7507
1.1955 4.7284 21500 0.8266 0.0 5.7602
1.2062 4.8384 22000 0.8069 0.0 5.7562
1.2058 4.9483 22500 0.7805 0.0 5.7087
1.0832 5.0583 23000 0.7583 0.0 5.7631
0.9869 5.1682 23500 0.7497 0.0 5.7284
0.9956 5.2782 24000 0.7356 0.0 5.7438
1.0164 5.3882 24500 0.7253 0.0 5.7789
1.017 5.4981 25000 0.7075 0.0 5.7462
1.0365 5.6081 25500 0.6890 0.0 5.7487
1.0421 5.7181 26000 0.6770 0.0 5.7547
1.0344 5.8280 26500 0.6560 0.0 5.7624
1.0286 5.9380 27000 0.6429 0.0 5.7816
0.9637 6.0479 27500 0.6257 0.0 5.7547
0.8297 6.1579 28000 0.6144 0.0 5.7649
0.8625 6.2679 28500 0.6038 0.0 5.7442
0.8587 6.3778 29000 0.5889 0.0 5.7633
0.8732 6.4878 29500 0.5788 0.0 5.7676
0.8738 6.5978 30000 0.5673 0.0 5.7698
0.8938 6.7077 30500 0.5521 0.0 5.7929
0.8797 6.8177 31000 0.5410 0.0 5.7542
0.9055 6.9276 31500 0.5284 0.0 5.7551
0.8408 7.0376 32000 0.5154 0.0 5.754
0.7278 7.1476 32500 0.5106 0.0 5.7602
0.7357 7.2575 33000 0.4958 0.0 5.7422
0.7498 7.3675 33500 0.4906 0.0 5.734
0.7524 7.4775 34000 0.4804 0.0 5.7136
0.7609 7.5874 34500 0.4716 0.0 5.7504
0.7555 7.6974 35000 0.4621 38.6861 5.7544
0.7752 7.8073 35500 0.4493 0.0 5.7429
0.7656 7.9173 36000 0.4387 0.0 5.7484
0.7329 8.0273 36500 0.4281 0.0 5.7364
0.6314 8.1372 37000 0.4251 0.0 5.7453
0.6595 8.2472 37500 0.4161 0.0 5.7393
0.6566 8.3572 38000 0.4125 0.0 5.7502
0.6582 8.4671 38500 0.4043 0.0 5.7364
0.6579 8.5771 39000 0.3962 0.0 5.7422
0.6622 8.6870 39500 0.3878 0.0 5.76
0.6547 8.7970 40000 0.3790 0.0 5.7642
0.6682 8.9070 40500 0.3701 0.0 5.7549
0.6499 9.0169 41000 0.3584 0.0 5.7333
0.541 9.1269 41500 0.3547 0.0 5.7398
0.5621 9.2369 42000 0.3519 0.0 5.7322
0.5673 9.3468 42500 0.3458 0.0 5.7467
0.5618 9.4568 43000 0.3407 0.0 5.7382
0.5704 9.5667 43500 0.3326 0.0 5.7536
0.5816 9.6767 44000 0.3292 0.0 5.7349
0.5892 9.7867 44500 0.3194 0.0 5.7358
0.5796 9.8966 45000 0.3129 0.0 5.7369
0.5807 10.0066 45500 0.3079 0.0 5.7404
0.4786 10.1166 46000 0.3033 0.0 5.7491
0.4863 10.2265 46500 0.2989 0.0 5.7331
0.4979 10.3365 47000 0.2968 0.0 5.732
0.5015 10.4464 47500 0.2917 0.0 5.7229
0.5105 10.5564 48000 0.2886 0.0 5.7398
0.5039 10.6664 48500 0.2830 0.0 5.7173
0.5202 10.7763 49000 0.2789 0.0 5.7218
0.5123 10.8863 49500 0.2742 0.0 5.7276
0.5043 10.9963 50000 0.2670 0.0 5.7191
0.4314 11.1062 50500 0.2661 0.0 5.7364
0.4345 11.2162 51000 0.2612 0.0 5.7262
0.4411 11.3261 51500 0.2592 0.0 5.7233
0.447 11.4361 52000 0.2568 0.0 5.7344
0.453 11.5461 52500 0.2528 0.0 5.7231
0.4485 11.6560 53000 0.2496 0.0 5.7311
0.4472 11.7660 53500 0.2460 0.0 5.7167
0.4567 11.8760 54000 0.2412 0.0 5.7256
0.4528 11.9859 54500 0.2381 0.0 5.7264
0.404 12.0959 55000 0.2342 0.0 5.7187
0.3995 12.2059 55500 0.2333 0.0 5.7293
0.3989 12.3158 56000 0.2317 0.0 5.7104
0.3988 12.4258 56500 0.2284 0.0 5.7242
0.3991 12.5357 57000 0.2261 0.0 5.7276
0.4075 12.6457 57500 0.2234 0.0 5.7198
0.4074 12.7557 58000 0.2207 0.0 5.7262
0.398 12.8656 58500 0.2178 0.0 5.7282
0.4003 12.9756 59000 0.2162 0.0 5.7291
0.374 13.0856 59500 0.2145 0.0 5.7271
0.3749 13.1955 60000 0.2126 0.0 5.7287
0.3589 13.3055 60500 0.2109 0.0 5.7356
0.3734 13.4154 61000 0.2095 0.0 5.7329
0.3706 13.5254 61500 0.2087 0.0 5.7327
0.3781 13.6354 62000 0.2071 0.0 5.7296
0.3735 13.7453 62500 0.2060 0.0 5.7287
0.372 13.8553 63000 0.2039 0.0 5.718
0.3751 13.9653 63500 0.2024 0.0 5.728
0.3573 14.0752 64000 0.2014 0.0 5.7189
0.3322 14.1852 64500 0.2010 0.0 5.7204
0.3359 14.2951 65000 0.2003 0.0 5.7227
0.3533 14.4051 65500 0.1994 0.0 5.7222
0.3489 14.5151 66000 0.1986 0.0 5.7198
0.3358 14.6250 66500 0.1981 0.0 5.7231
0.3424 14.7350 67000 0.1977 0.0 5.72
0.3341 14.8450 67500 0.1976 0.0 5.7209
0.3513 14.9549 68000 0.1973 0.0 5.7184
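
The Epoch and Step columns make it possible to back out the number of optimizer steps per epoch, and from that a rough training-set size and the learning rate at any step under the linear schedule. The sketch below does this arithmetic under stated assumptions that the card does not confirm: a single device, no gradient accumulation, a partial last batch that is kept, and zero warmup steps. The function names are illustrative.

```python
BASE_LR = 2e-5     # learning_rate from the card
NUM_EPOCHS = 15    # num_epochs from the card
BATCH_SIZE = 32    # train_batch_size from the card


def steps_per_epoch(step, epoch):
    """Estimate optimizer steps per epoch from one (step, epoch) table row."""
    return round(step / epoch)


def linear_lr(step, total_steps, base_lr=BASE_LR, warmup_steps=0):
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * (step / max(1, warmup_steps))
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))


# The last table row (step 68000, epoch 14.9549) gives the most precise estimate.
SPE = steps_per_epoch(68000, 14.9549)          # 4547 steps per epoch
TOTAL_STEPS = SPE * NUM_EPOCHS                 # 68205 steps over 15 epochs

# With batches of 32 and a kept partial last batch, the training-set size
# lies in this inclusive range.
MIN_EXAMPLES = (SPE - 1) * BATCH_SIZE + 1      # 145473
MAX_EXAMPLES = SPE * BATCH_SIZE                # 145504

if __name__ == "__main__":
    print(SPE, TOTAL_STEPS, MIN_EXAMPLES, MAX_EXAMPLES)
    print(linear_lr(34000, TOTAL_STEPS))  # learning rate near the midpoint
```

Under these assumptions the last logged evaluation, at step 68000, falls about 205 steps before the end of training, where the scheduled learning rate is already close to zero.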

Framework versions

  • Transformers 4.40.2
  • Pytorch 2.3.0+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1

Model size

  • 484M params (Safetensors, tensor type F32)