iva_mt_wslot-m2m100_418M-en-pl-massive_filtered

This model is a fine-tuned version of facebook/m2m100_418M on the iva_mt_wslot-exp dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0183
  • Bleu: 62.374
  • Gen Len: 22.2291
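
For reference, a minimal inference sketch using the Transformers M2M100 classes is shown below. The Hub repository id and the example sentence are assumptions rather than values from this card; prepend the owner namespace to the model name when loading.

```python
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

# Assumed Hub id: prepend the owner namespace, e.g. "<owner>/...".
model_id = "iva_mt_wslot-m2m100_418M-en-pl-massive_filtered"

tokenizer = M2M100Tokenizer.from_pretrained(model_id, src_lang="en", tgt_lang="pl")
model = M2M100ForConditionalGeneration.from_pretrained(model_id)

text = "Set an alarm for 7 am tomorrow."  # illustrative input
inputs = tokenizer(text, return_tensors="pt")
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.get_lang_id("pl"),  # force Polish as the target language
    max_length=128,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```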

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 10
  • mixed_precision_training: Native AMP
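
For readers reproducing the run, here is a sketch of how these values map onto Seq2SeqTrainingArguments. The output_dir, the per-epoch evaluation strategy, and predict_with_generate are assumptions, not values stated on this card.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch only: output_dir and the evaluation settings below are assumptions.
training_args = Seq2SeqTrainingArguments(
    output_dir="iva_mt_wslot-m2m100_418M-en-pl",  # hypothetical path
    learning_rate=2e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=10,
    fp16=True,                    # Native AMP mixed-precision training
    evaluation_strategy="epoch",  # assumption: matches the per-epoch results table
    predict_with_generate=True,   # assumption: required to compute BLEU / Gen Len at eval time
    # Adam betas=(0.9, 0.999) and epsilon=1e-08 are the Trainer defaults.
)
```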

Training results

| Training Loss | Epoch | Step  | Validation Loss | Bleu    | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|
| 0.02          | 1.0   | 4207  | 0.0178          | 58.2501 | 22.1015 |
| 0.0138        | 2.0   | 8414  | 0.0165          | 59.7044 | 22.3942 |
| 0.0098        | 3.0   | 12621 | 0.0164          | 60.5111 | 22.2572 |
| 0.0076        | 4.0   | 16828 | 0.0165          | 61.0963 | 22.2659 |
| 0.0056        | 5.0   | 21035 | 0.0168          | 61.2977 | 22.1654 |
| 0.0041        | 6.0   | 25242 | 0.0171          | 61.6892 | 22.2384 |
| 0.0032        | 7.0   | 29449 | 0.0177          | 61.5857 | 22.2927 |
| 0.0024        | 8.0   | 33656 | 0.0179          | 62.1501 | 22.1389 |
| 0.0018        | 9.0   | 37863 | 0.0182          | 62.3106 | 22.3325 |
| 0.0015        | 10.0  | 42070 | 0.0183          | 62.374  | 22.2291 |
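
The card does not state which BLEU implementation produced these scores; sacreBLEU via the evaluate library is a common choice with Seq2SeqTrainer. A minimal scoring sketch, with placeholder sentences:

```python
import evaluate

# Minimal scoring sketch; the sentences below are illustrative placeholders.
bleu = evaluate.load("sacrebleu")
predictions = ["Ustaw alarm na jutro na 7 rano."]
references = [["Ustaw budzik na jutro na 7 rano."]]
result = bleu.compute(predictions=predictions, references=references)
print(round(result["score"], 4))
```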

Framework versions

  • Transformers 4.28.1
  • Pytorch 2.0.0+cu118
  • Datasets 2.11.0
  • Tokenizers 0.13.3