
mt5.baseline

This model is a fine-tuned version of samzirbo/mT5.en-es.pretrained; the fine-tuning dataset is not specified. At the final checkpoint (step 30000) it achieves the following results on the evaluation set:

  • Loss: 1.5093
  • BLEU: 38.6464
  • METEOR: 0.661
  • chrF++: 60.6878

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0005
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 1000
  • training_steps: 30000
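
To make the learning-rate settings above concrete, here is a minimal sketch of a cosine schedule with linear warmup, using the listed learning_rate, lr_scheduler_warmup_steps, and training_steps. It assumes the common half-cycle cosine decay to zero (the default behaviour of the cosine scheduler in Transformers); the card does not state the number of cycles, and the function name `lr_at` is hypothetical.

```python
import math

# Values taken from the hyperparameter list above.
BASE_LR = 5e-4
WARMUP_STEPS = 1000
TOTAL_STEPS = 30000


def lr_at(step, base_lr=BASE_LR, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Learning rate at a given optimizer step.

    Assumption: linear warmup from 0 to base_lr, then a single
    half-cosine decay from base_lr to 0 over the remaining steps.
    """
    if step < warmup:
        # Linear warmup phase.
        return base_lr * step / max(1, warmup)
    # Fraction of the decay phase completed, in [0, 1].
    progress = (step - warmup) / max(1, total - warmup)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    for step in (0, 500, 1000, 15500, 30000):
        print(f"step {step:>6}: lr = {lr_at(step):.6f}")
```

Under these assumptions the rate peaks at 0.0005 at step 1000, falls to half of that midway through the decay phase, and reaches zero at step 30000.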

Training results

| Training Loss | Epoch  | Step  | Validation Loss | BLEU    | METEOR | chrF++  |
|---------------|--------|-------|-----------------|---------|--------|---------|
| 4.0484        | 0.3215 | 3000  | 2.1130          | 29.7312 | 0.5872 | 53.2622 |
| 2.3309        | 0.6431 | 6000  | 1.8472          | 33.4852 | 0.6209 | 56.6127 |
| 2.0987        | 0.9646 | 9000  | 1.7299          | 35.1261 | 0.6355 | 58.0524 |
| 1.9355        | 1.2862 | 12000 | 1.6594          | 36.3851 | 0.6449 | 58.9991 |
| 1.8568        | 1.6077 | 15000 | 1.5978          | 37.0844 | 0.6499 | 59.4457 |
| 1.8039        | 1.9293 | 18000 | 1.5601          | 37.7628 | 0.6562 | 60.145  |
| 1.7271        | 2.2508 | 21000 | 1.5298          | 38.1387 | 0.6572 | 60.3042 |
| 1.6984        | 2.5723 | 24000 | 1.5148          | 38.5117 | 0.66   | 60.5765 |
| 1.6846        | 2.8939 | 27000 | 1.5096          | 38.5563 | 0.6604 | 60.6276 |
| 1.6687        | 3.2154 | 30000 | 1.5093          | 38.6464 | 0.661  | 60.6878 |
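
Although the training data is not documented, the Epoch and Step columns give a rough idea of its size. The sketch below divides step by epoch to estimate steps per epoch, then multiplies by the batch size. It assumes train_batch_size (64) is the effective batch size, that is, a single device and no gradient accumulation, neither of which the card confirms; the function name `estimate_epoch_size` is hypothetical.

```python
TRAIN_BATCH_SIZE = 64  # from the hyperparameter list


def estimate_epoch_size(step, epoch, batch_size=TRAIN_BATCH_SIZE):
    """Estimate steps per epoch and training examples per epoch.

    Assumption: batch_size is the effective batch size (one device,
    no gradient accumulation). If that is wrong, the example count
    scales by the same factor.
    """
    steps_per_epoch = step / epoch
    examples_per_epoch = steps_per_epoch * batch_size
    return steps_per_epoch, examples_per_epoch


if __name__ == "__main__":
    # (step, epoch) pairs from the first and last rows of the results table.
    for step, epoch in [(3000, 0.3215), (30000, 3.2154)]:
        spe, examples = estimate_epoch_size(step, epoch)
        print(f"step {step}: ~{spe:.0f} steps/epoch, ~{examples:.0f} examples/epoch")
```

Both rows point to roughly 9,330 steps per epoch, which under the stated assumption corresponds to a training set of about 597,000 examples, seen a little over three times in 30000 steps.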

Framework versions

  • Transformers 4.40.1
  • Pytorch 2.2.1+cu121
  • Datasets 2.19.0
  • Tokenizers 0.19.1