baseline

This model is a fine-tuned version of samzirbo/mT5.en-es.pretrained on an unknown dataset. At the final checkpoint (step 50000), it achieves the following results on the evaluation set:

  • Loss: 1.1724
  • Bleu: 43.677
  • Meteor: 0.6901
  • Chrf++: 62.5868

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0005
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 1000
  • training_steps: 50000
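
The card only records `lr_scheduler_type: cosine` with 1000 warmup steps, so the exact curve is not stated. The sketch below shows one plausible reading, assuming a linear warmup from zero followed by a single half-cosine decay to zero over the remaining steps, which is how the default cosine scheduler in Transformers is generally understood to behave. The function name `lr_at` and the constant names are hypothetical and chosen only for illustration.

```python
import math

# Values copied from the hyperparameter list above.
BASE_LR = 5e-4           # learning_rate
WARMUP_STEPS = 1_000     # lr_scheduler_warmup_steps
TRAINING_STEPS = 50_000  # training_steps


def lr_at(step, base_lr=BASE_LR, warmup_steps=WARMUP_STEPS,
          total_steps=TRAINING_STEPS):
    """Learning rate at a given optimizer step (hypothetical helper).

    Assumption: linear warmup from 0, then half-cosine decay to 0.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    # Half-cosine decay from base_lr down to 0.
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    # Print the rate at a few steps, including the evaluation interval of 2500.
    for step in (0, 500, 1_000, 2_500, 25_500, 50_000):
        print(f"step {step:>6}: lr = {lr_at(step):.6f}")
```

Under these assumptions the rate peaks at 0.0005 when warmup ends at step 1000, falls to half of that midway through the decay phase (step 25500), and reaches zero at step 50000.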

Training results

Training Loss  Epoch  Step   Validation Loss  Bleu     Meteor  Chrf++
4.3403         0.26   2500   2.0224           27.59    0.5546  49.0181
2.4264         0.53   5000   1.7329           32.4582  0.6023  53.8983
2.1747         0.79   7500   1.5850           35.9783  0.6246  56.295
2.0285         1.05   10000  1.5016           37.3015  0.638   57.5591
1.9104         1.32   12500  1.4356           38.832   0.6501  58.6692
1.8547         1.58   15000  1.3784           39.7112  0.6593  59.4218
1.8013         1.84   17500  1.3481           39.9137  0.6608  59.7434
1.7372         2.11   20000  1.3070           40.8569  0.6679  60.4092
1.6845         2.37   22500  1.2847           41.5254  0.6721  60.8743
1.6611         2.64   25000  1.2574           42.0492  0.6767  61.2287
1.6382         2.9    27500  1.2372           42.2626  0.6806  61.5161
1.595          3.16   30000  1.2220           42.827   0.6835  61.9015
1.5645         3.43   32500  1.2088           42.909   0.6828  61.8832
1.5557         3.69   35000  1.1981           43.2386  0.6852  62.1239
1.5473         3.95   37500  1.1862           43.4076  0.6866  62.3625
1.5147         4.22   40000  1.1797           43.5469  0.6876  62.3958
1.5089         4.48   42500  1.1765           43.5486  0.689   62.5208
1.5032         4.74   45000  1.1738           43.6415  0.6893  62.5473
1.4998         5.01   47500  1.1724           43.6758  0.6898  62.581
1.4905         5.27   50000  1.1724           43.677   0.6901  62.5868
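
The dataset is listed as unknown, but the Epoch and Step columns allow a rough estimate of its size. The sketch below is a back-of-the-envelope calculation, not a reported figure. It assumes that `train_batch_size: 64` is the effective batch size (single device, no gradient accumulation; the card does not report either), and it inherits the rounding of the Epoch column to two decimals. The helper name `estimate_steps_per_epoch` is hypothetical.

```python
def estimate_steps_per_epoch(step: int, epoch: float) -> float:
    """Optimizer steps per epoch implied by one (step, epoch) table row."""
    return step / epoch


# Final row of the training results table.
final_step, final_epoch = 50_000, 5.27

# From the hyperparameter list. Assumption: this is the effective batch size.
train_batch_size = 64

steps_per_epoch = estimate_steps_per_epoch(final_step, final_epoch)
approx_train_examples = steps_per_epoch * train_batch_size

print(f"~{steps_per_epoch:.0f} steps per epoch")
print(f"~{approx_train_examples:,.0f} training examples (estimate)")
```

Under these assumptions the run works out to roughly 9,500 steps per epoch, which would put the training set at about 600,000 examples. The last row is used because the two-decimal rounding of the epoch value has the smallest relative effect there.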

Framework versions

  • Transformers 4.38.0
  • Pytorch 2.2.1+cu121
  • Datasets 2.19.1
  • Tokenizers 0.15.2

Model size: 60.4M params (Safetensors, tensor type F32)