
iva_mt_wslot-m2m100_418M-en-pl-nomassive

This model is a fine-tuned version of facebook/m2m100_418M for English-to-Polish (en-pl) machine translation. It achieves the following results on the evaluation set (a minimal usage sketch follows the metrics):

  • Loss: 0.3751
  • BLEU: 61.2295
  • Gen Len: 23.3774
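
The card does not yet include a usage example. Below is a minimal inference sketch, assuming the model is published under the Hub id in the title and follows the standard M2M100 API; the slot-annotated input format is a guess based on the `wslot` part of the model name:

```python
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

model_id = "cartesinus/iva_mt_wslot-m2m100_418M-en-pl-nomassive"
tokenizer = M2M100Tokenizer.from_pretrained(model_id)
model = M2M100ForConditionalGeneration.from_pretrained(model_id)

# Translate English to Polish: M2M100 needs the source language set on the
# tokenizer and the target language forced as the first generated token.
tokenizer.src_lang = "en"
text = "set an alarm for seven am"  # hypothetical utterance
encoded = tokenizer(text, return_tensors="pt")
generated = model.generate(
    **encoded,
    forced_bos_token_id=tokenizer.get_lang_id("pl"),
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```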

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them to `Seq2SeqTrainingArguments` follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 10
  • mixed_precision_training: Native AMP
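
The card lists the hyperparameters but not the training code. A minimal sketch of how they could map onto `transformers.Seq2SeqTrainingArguments`; `output_dir`, `evaluation_strategy`, and `predict_with_generate` are assumptions not stated above, and the Adam betas/epsilon are the library defaults:

```python
from transformers import Seq2SeqTrainingArguments

# Hedged reconstruction of the hyperparameters listed above.
# Adam betas=(0.9, 0.999) and epsilon=1e-08 are the Transformers defaults,
# so they need no explicit arguments.
training_args = Seq2SeqTrainingArguments(
    output_dir="iva_mt_wslot-m2m100_418M-en-pl-nomassive",  # assumed name
    learning_rate=2e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=10,
    fp16=True,                     # "Native AMP" mixed-precision training
    evaluation_strategy="epoch",   # assumed from the per-epoch results table
    predict_with_generate=True,    # assumed; needed to report BLEU / Gen Len
)
```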

Training results

| Training Loss | Epoch | Step  | Validation Loss | BLEU    | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|
| 0.4315        | 1.0   | 2212  | 0.3475          | 57.0703 | 23.338  |
| 0.2976        | 2.0   | 4424  | 0.3295          | 58.3378 | 23.5036 |
| 0.2153        | 3.0   | 6636  | 0.3347          | 59.3015 | 23.5601 |
| 0.1542        | 4.0   | 8848  | 0.3420          | 59.3942 | 23.4442 |
| 0.1177        | 5.0   | 11060 | 0.3454          | 60.4525 | 23.3501 |
| 0.0858        | 6.0   | 13272 | 0.3529          | 60.6123 | 23.3422 |
| 0.0600        | 7.0   | 15484 | 0.3586          | 60.1342 | 23.4478 |
| 0.0479        | 8.0   | 17696 | 0.3668          | 60.6348 | 23.4163 |
| 0.0371        | 9.0   | 19908 | 0.3728          | 61.3986 | 23.4436 |
| 0.0299        | 10.0  | 22120 | 0.3751          | 61.2295 | 23.3774 |
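
The card does not state how BLEU was computed. A minimal sketch using the `evaluate` library's `sacrebleu` metric, which the Transformers translation examples typically use (an assumption here, as are the sample strings):

```python
import evaluate

bleu = evaluate.load("sacrebleu")

# Hypothetical prediction/reference pair purely for illustration.
predictions = ["ustaw alarm na siódmą rano"]
references = [["ustaw alarm na siódmą rano"]]

result = bleu.compute(predictions=predictions, references=references)
print(round(result["score"], 4))  # corpus-level BLEU score
```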

Framework versions

  • Transformers 4.27.4
  • Pytorch 2.0.0+cu118
  • Datasets 2.11.0
  • Tokenizers 0.13.3