m2m100_lr_2e5_gradd_accum_1

This model is a fine-tuned version of facebook/m2m100_418M on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 3.2511
  • Bleu: 10.7253
  • Gen Len: 45.9543
  • Meteor: 0.3079
  • Chrf: 33.934

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 12
  • eval_batch_size: 12
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 32.0
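
As a rough sketch, the values listed above could be expressed as keyword arguments for `transformers.Seq2SeqTrainingArguments`. The card does not say which arguments class or script was used, so that choice is an assumption. Also assumed: `output_dir` (a hypothetical path), `gradient_accumulation_steps=1` (suggested only by the model name), `predict_with_generate=True`, and the reading of both batch sizes as per-device values on a single device.

```python
# Hyperparameters copied from the list above, renamed to the keyword
# arguments accepted by transformers.Seq2SeqTrainingArguments.
HYPERPARAMS = {
    "learning_rate": 2e-05,
    "per_device_train_batch_size": 12,  # assumes a single device
    "per_device_eval_batch_size": 12,   # assumes a single device
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 32.0,
}

# Values NOT stated in the card; every entry here is an assumption.
ASSUMED = {
    "output_dir": "m2m100_lr_2e5_gradd_accum_1",  # hypothetical path
    "gradient_accumulation_steps": 1,             # inferred from the model name only
    "predict_with_generate": True,                # generation is needed to score BLEU-style metrics
}


def build_training_args():
    """Build the arguments object; requires the `transformers` package."""
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(**HYPERPARAMS, **ASSUMED)
```

Calling `build_training_args()` returns an object that could be handed to a `Seq2SeqTrainer`. The dataset, tokenizer settings, and language pair would still have to be supplied, since the card does not document them.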

Training results

| Training Loss | Epoch | Step  | Validation Loss | Bleu    | Gen Len | Meteor | Chrf    |
|---------------|-------|-------|-----------------|---------|---------|--------|---------|
| 3.6632        | 3.97  | 2406  | 2.8977          | 7.6826  | 49.2935 | 0.2567 | 28.2679 |
| 2.1329        | 7.94  | 4812  | 2.7600          | 10.0915 | 47.2241 | 0.2999 | 32.4294 |
| 1.552         | 11.91 | 7218  | 2.8218          | 10.3506 | 45.6178 | 0.3041 | 33.2923 |
| 1.1577        | 15.88 | 9624  | 2.9258          | 10.3313 | 46.668  | 0.3058 | 33.4639 |
| 0.8781        | 19.85 | 12030 | 3.0435          | 10.5266 | 46.2384 | 0.3063 | 33.6649 |
| 0.6935        | 23.82 | 14436 | 3.1381          | 10.4391 | 46.0441 | 0.3062 | 33.7686 |
| 0.5683        | 27.79 | 16842 | 3.2178          | 10.6801 | 45.8612 | 0.309  | 33.8163 |
| 0.5037        | 31.76 | 19248 | 3.2511          | 10.7253 | 45.9543 | 0.3079 | 33.934  |
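
The logged rows above also allow a few derived quantities to be estimated. The sketch below is illustrative only: it copies the rows from the table, estimates the number of optimizer steps per epoch, and compares the checkpoint with the lowest validation loss against the one with the highest BLEU. The training-set size estimate assumes a single device and no gradient accumulation, neither of which the card confirms.

```python
# Rows copied from the training results table:
# (training loss, epoch, step, validation loss, BLEU, gen len, METEOR, chrF)
ROWS = [
    (3.6632, 3.97, 2406, 2.8977, 7.6826, 49.2935, 0.2567, 28.2679),
    (2.1329, 7.94, 4812, 2.7600, 10.0915, 47.2241, 0.2999, 32.4294),
    (1.552, 11.91, 7218, 2.8218, 10.3506, 45.6178, 0.3041, 33.2923),
    (1.1577, 15.88, 9624, 2.9258, 10.3313, 46.668, 0.3058, 33.4639),
    (0.8781, 19.85, 12030, 3.0435, 10.5266, 46.2384, 0.3063, 33.6649),
    (0.6935, 23.82, 14436, 3.1381, 10.4391, 46.0441, 0.3062, 33.7686),
    (0.5683, 27.79, 16842, 3.2178, 10.6801, 45.8612, 0.309, 33.8163),
    (0.5037, 31.76, 19248, 3.2511, 10.7253, 45.9543, 0.3079, 33.934),
]

TRAIN_BATCH_SIZE = 12  # from the hyperparameter list

# Optimizer steps per epoch, averaged over the logged (epoch, step) pairs.
steps_per_epoch = round(sum(row[2] / row[1] for row in ROWS) / len(ROWS))

# Rough upper bound on the number of training examples, ASSUMING one
# device and gradient_accumulation_steps == 1 (not confirmed by the card).
approx_train_examples = steps_per_epoch * TRAIN_BATCH_SIZE

# Checkpoints that look best under two different criteria.
best_by_val_loss = min(ROWS, key=lambda row: row[3])
best_by_bleu = max(ROWS, key=lambda row: row[4])

print("steps per epoch:", steps_per_epoch)
print("approx. training examples:", approx_train_examples)
print("lowest validation loss at step:", best_by_val_loss[2])
print("highest BLEU at step:", best_by_bleu[2])
```

Under those assumptions this gives about 606 steps per epoch, or at most roughly 7,272 training examples. Validation loss is lowest at step 4812 (2.7600) and rises afterwards, while BLEU and chrF keep improving slightly and peak at the final step 19248, so the preferred checkpoint depends on which criterion matters more.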

Framework versions

  • Transformers 4.30.2
  • Pytorch 1.11.0+cu113
  • Datasets 2.10.0
  • Tokenizers 0.12.1