md_mt5_0109

This model is a fine-tuned version of google/mt5-small on an unspecified dataset (the auto-generated card records the dataset name as None). It achieves the following results on the evaluation set:

  • Loss: 0.4790
  • Bleu: 0.457
  • Gen Len: 18.9295

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 15
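
As a rough illustration of what the linear scheduler listed above does to the learning rate, here is a minimal pure-Python sketch. The total step count (14625) is the final step reported under "Training results" in this card. The zero-warmup setting is an assumption: the card does not list warmup steps. The function name `linear_lr` is hypothetical and is not part of any library.

```python
# Hyperparameters copied from the card.
HYPERPARAMS = {
    "learning_rate": 2e-05,
    "train_batch_size": 4,
    "eval_batch_size": 4,
    "seed": 42,
    "lr_scheduler_type": "linear",
    "num_epochs": 15,
}

# Final optimizer step reported in the training results table.
TOTAL_STEPS = 14625


def linear_lr(step, base_lr=HYPERPARAMS["learning_rate"],
              total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate at `step` under linear warmup followed by linear decay to zero.

    warmup_steps=0 is an assumption; the card does not state a warmup value.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * (step / max(1, warmup_steps))
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))


if __name__ == "__main__":
    # Print the learning rate at the end of a few epochs (975 steps per epoch).
    for epoch in (0, 1, 5, 10, 15):
        print(epoch, linear_lr(epoch * 975))
```

Under these assumptions the rate starts at 2e-05 and reaches zero at the final step, which is consistent with the shrinking per-epoch loss improvements late in training.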

Training results

| Training Loss | Epoch | Step  | Validation Loss | Bleu   | Gen Len |
|---------------|-------|-------|-----------------|--------|---------|
| 13.8417       | 1.0   | 975   | 2.6438          | 0.563  | 15.6487 |
| 2.8117        | 2.0   | 1950  | 1.4148          | 0.891  | 17.2223 |
| 1.8883        | 3.0   | 2925  | 1.0693          | 0.401  | 18.7582 |
| 1.5248        | 4.0   | 3900  | 0.8703          | 0.4583 | 18.8508 |
| 1.3116        | 5.0   | 4875  | 0.7483          | 0.4651 | 18.8856 |
| 1.1617        | 6.0   | 5850  | 0.6783          | 0.4542 | 18.9005 |
| 1.0636        | 7.0   | 6825  | 0.6243          | 0.459  | 18.9054 |
| 0.9928        | 8.0   | 7800  | 0.5869          | 0.4707 | 18.9038 |
| 0.9272        | 9.0   | 8775  | 0.5536          | 0.4563 | 18.9031 |
| 0.8926        | 10.0  | 9750  | 0.5282          | 0.4606 | 18.9177 |
| 0.8568        | 11.0  | 10725 | 0.5091          | 0.4577 | 18.9226 |
| 0.8341        | 12.0  | 11700 | 0.4964          | 0.4482 | 18.9259 |
| 0.8176        | 13.0  | 12675 | 0.4867          | 0.4539 | 18.9262 |
| 0.806         | 14.0  | 13650 | 0.4812          | 0.4576 | 18.9264 |
| 0.7945        | 15.0  | 14625 | 0.4790          | 0.457  | 18.9295 |
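
The card does not describe the training data, but the step counts above allow a rough size estimate. The sketch below works through that arithmetic and checks that validation loss falls at every epoch. It assumes a single device and no gradient accumulation, neither of which the card states, so the range it derives is only an estimate. All variable names are illustrative.

```python
# Figures copied from the card.
TRAIN_BATCH_SIZE = 4
NUM_EPOCHS = 15
FINAL_STEP = 14625

# Validation loss per epoch, epochs 1 through 15, from the results table.
VAL_LOSS = [
    2.6438, 1.4148, 1.0693, 0.8703, 0.7483,
    0.6783, 0.6243, 0.5869, 0.5536, 0.5282,
    0.5091, 0.4964, 0.4867, 0.4812, 0.4790,
]

# Optimizer steps per epoch.
steps_per_epoch = FINAL_STEP // NUM_EPOCHS

# ASSUMPTION: one device, no gradient accumulation, so one step = one batch.
# The last batch of an epoch may be partial, hence a range rather than a point.
max_train_examples = steps_per_epoch * TRAIN_BATCH_SIZE
min_train_examples = (steps_per_epoch - 1) * TRAIN_BATCH_SIZE + 1

# Does validation loss decrease strictly from one epoch to the next?
loss_strictly_decreasing = all(
    later < earlier for earlier, later in zip(VAL_LOSS, VAL_LOSS[1:])
)

# Improvement gained in the final epoch.
last_epoch_gain = VAL_LOSS[-2] - VAL_LOSS[-1]

if __name__ == "__main__":
    print("steps per epoch:", steps_per_epoch)
    print("training examples (estimated range):",
          min_train_examples, "to", max_train_examples)
    print("validation loss strictly decreasing:", loss_strictly_decreasing)
    print("gain in final epoch:", round(last_epoch_gain, 4))
```

Validation loss is still decreasing at epoch 15, though only slightly, while Bleu has stayed close to 0.46 since epoch 4.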

Framework versions

  • Transformers 4.35.2
  • Pytorch 2.1.0+cu121
  • Datasets 2.16.1
  • Tokenizers 0.15.0

Model weights

  • Format: Safetensors
  • Model size: 300M params
  • Tensor type: F32