opus-mt-en-bkm

This model is a fine-tuned version of Helsinki-NLP/opus-mt-en-ro, trained for English-to-bkm translation on a dataset stored in Apache Arrow format. It achieves the following results on the evaluation set:

  • Loss: 1.1790
  • Bleu: 17.7574
  • Gen Len: 58.4209
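
Since the card does not yet include usage instructions, here is a minimal inference sketch. The repo id `opus-mt-en-bkm` is assumed from the card title and may need the owning user or organization as a prefix on the Hub.

```python
# Minimal inference sketch for this MarianMT fine-tune.
# "opus-mt-en-bkm" is assumed from the card title; prefix with the
# owning user/organization if loading from the Hub.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "opus-mt-en-bkm"  # hypothetical repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("Good morning, how are you?", return_tensors="pt")
outputs = model.generate(**inputs, max_length=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```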

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 50
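
As a rough reconstruction, the hyperparameters above map onto `Seq2SeqTrainingArguments` as sketched below; `output_dir`, `evaluation_strategy`, and `predict_with_generate` are assumptions added for illustration, not part of the documented recipe (the listed Adam settings match the Trainer defaults).

```python
# Sketch of how the listed hyperparameters map onto Seq2SeqTrainingArguments.
# output_dir, evaluation_strategy, and predict_with_generate are assumptions.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="opus-mt-en-bkm",    # hypothetical
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=50,
    evaluation_strategy="epoch",    # assumed: the results table is per-epoch
    predict_with_generate=True,     # assumed: needed for BLEU / Gen Len
)
```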

Training results

| Training Loss | Epoch | Step  | Validation Loss | Bleu    | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|
| 2.1758        | 1.0   | 1113  | 1.8681          | 4.1739  | 58.6351 |
| 1.8143        | 2.0   | 2226  | 1.6288          | 6.2869  | 62.8396 |
| 1.635         | 3.0   | 3339  | 1.4789          | 7.8756  | 58.5721 |
| 1.4988        | 4.0   | 4452  | 1.3930          | 9.2821  | 59.5793 |
| 1.3753        | 5.0   | 5565  | 1.3288          | 10.4942 | 58.924  |
| 1.3015        | 6.0   | 6678  | 1.2773          | 11.3724 | 60.0849 |
| 1.2424        | 7.0   | 7791  | 1.2419          | 12.1525 | 60.724  |
| 1.1758        | 8.0   | 8904  | 1.2131          | 12.5595 | 58.5216 |
| 1.1263        | 9.0   | 10017 | 1.1882          | 13.4807 | 58.1827 |
| 1.0781        | 10.0  | 11130 | 1.1720          | 13.6583 | 56.953  |
| 1.0377        | 11.0  | 12243 | 1.1571          | 14.2744 | 58.1146 |
| 1.0014        | 12.0  | 13356 | 1.1437          | 14.5804 | 57.9928 |
| 0.9737        | 13.0  | 14469 | 1.1326          | 14.9612 | 57.4652 |
| 0.9384        | 14.0  | 15582 | 1.1263          | 15.1647 | 58.4813 |
| 0.9061        | 15.0  | 16695 | 1.1262          | 15.3948 | 57.8562 |
| 0.8854        | 16.0  | 17808 | 1.1164          | 15.7348 | 57.8652 |
| 0.8657        | 17.0  | 18921 | 1.1179          | 15.9306 | 57.5578 |
| 0.837         | 18.0  | 20034 | 1.1140          | 16.0704 | 58.2836 |
| 0.8208        | 19.0  | 21147 | 1.1135          | 16.1836 | 57.6796 |
| 0.7919        | 20.0  | 22260 | 1.1117          | 16.4418 | 57.7658 |
| 0.7645        | 21.0  | 23373 | 1.1134          | 16.3838 | 58.2189 |
| 0.7519        | 22.0  | 24486 | 1.1157          | 16.4369 | 57.7701 |
| 0.7375        | 23.0  | 25599 | 1.1178          | 16.4328 | 57.5811 |
| 0.7221        | 24.0  | 26712 | 1.1186          | 16.8289 | 57.3139 |
| 0.7009        | 25.0  | 27825 | 1.1190          | 16.9092 | 57.9038 |
| 0.6882        | 26.0  | 28938 | 1.1254          | 17.0946 | 58.229  |
| 0.6778        | 27.0  | 30051 | 1.1246          | 17.1689 | 58.5953 |
| 0.6668        | 28.0  | 31164 | 1.1281          | 17.1734 | 58.1258 |
| 0.6589        | 29.0  | 32277 | 1.1322          | 16.9988 | 58.0218 |
| 0.639         | 30.0  | 33390 | 1.1297          | 17.2725 | 58.3717 |
| 0.6318        | 31.0  | 34503 | 1.1392          | 17.3926 | 57.9088 |
| 0.6174        | 32.0  | 35616 | 1.1429          | 17.385  | 58.6474 |
| 0.6105        | 33.0  | 36729 | 1.1443          | 17.4034 | 58.7521 |
| 0.5953        | 34.0  | 37842 | 1.1485          | 17.4571 | 58.4733 |
| 0.5897        | 35.0  | 38955 | 1.1491          | 17.4854 | 58.9544 |
| 0.5807        | 36.0  | 40068 | 1.1572          | 17.544  | 58.1013 |
| 0.5774        | 37.0  | 41181 | 1.1588          | 17.5858 | 58.4694 |
| 0.5633        | 38.0  | 42294 | 1.1588          | 17.604  | 58.2328 |
| 0.5565        | 39.0  | 43407 | 1.1640          | 17.7342 | 58.3148 |
| 0.5556        | 40.0  | 44520 | 1.1642          | 17.6596 | 58.6809 |
| 0.5469        | 41.0  | 45633 | 1.1671          | 17.5064 | 58.1013 |
| 0.5428        | 42.0  | 46746 | 1.1686          | 17.7473 | 58.5171 |
| 0.5342        | 43.0  | 47859 | 1.1719          | 17.749  | 58.8335 |
| 0.5292        | 44.0  | 48972 | 1.1730          | 17.6552 | 58.4492 |
| 0.5314        | 45.0  | 50085 | 1.1728          | 17.7932 | 58.6007 |
| 0.5283        | 46.0  | 51198 | 1.1770          | 17.7351 | 58.4564 |
| 0.5252        | 47.0  | 52311 | 1.1778          | 17.803  | 58.5793 |
| 0.5227        | 48.0  | 53424 | 1.1782          | 17.7729 | 58.3533 |
| 0.5206        | 49.0  | 54537 | 1.1788          | 17.7547 | 58.5108 |
| 0.5186        | 50.0  | 55650 | 1.1790          | 17.7574 | 58.4209 |
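
The Bleu and Gen Len columns are the kind of metrics typically produced by a sacreBLEU-based `compute_metrics` passed to `Seq2SeqTrainer`; the exact function used for this model is not documented, so the following is only a hedged sketch of the usual pattern.

```python
# Hedged sketch of a typical compute_metrics for translation fine-tuning;
# the exact implementation used for this model card is not documented.
import numpy as np
import evaluate
from transformers import AutoTokenizer

# Assumed: the tokenizer of the base checkpoint named in this card.
tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-ro")
bleu = evaluate.load("sacrebleu")

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    # Replace the -100 padding used for loss masking before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    result = bleu.compute(
        predictions=decoded_preds,
        references=[[label] for label in decoded_labels],
    )
    # "Gen Len" is the mean length of the generated token sequences.
    gen_lens = [np.count_nonzero(p != tokenizer.pad_token_id) for p in preds]
    return {"bleu": result["score"], "gen_len": float(np.mean(gen_lens))}
```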

Framework versions

  • Transformers 4.38.2
  • Pytorch 2.1.0+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2