metadata
license: apache-2.0
base_model: google/mt5-small
tags:
  - generated_from_trainer
metrics:
  - bleu
model-index:
  - name: md_mt5_0109
    results: []

md_mt5_0109

This model is a fine-tuned version of google/mt5-small. The training dataset is not specified (the auto-generated card records it as None). After the final epoch (15), it achieves the following results on the evaluation set:

  • Loss: 0.4790
  • Bleu: 0.457
  • Gen Len: 18.9295
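The card does not say how to run the checkpoint, so the following is only a minimal sketch of loading a fine-tuned mT5 model with the Transformers seq2seq classes. The Hub id `Buseak/md_mt5_0109` is an assumption inferred from the repository name and owner, and `generate` and `max_new_tokens=32` are illustrative choices rather than settings taken from the card. The expected input format is also undocumented.

```python
# Assumption: the checkpoint is published on the Hugging Face Hub under this id.
MODEL_ID = "Buseak/md_mt5_0109"


def generate(text, model_id=MODEL_ID, max_new_tokens=32):
    """Load the checkpoint and return one generated string for `text`.

    Hypothetical helper for illustration; the card does not define it.
    """
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Tokenize a single input string into PyTorch tensors.
    inputs = tokenizer(text, return_tensors="pt")

    # Default decoding strategy; max_new_tokens is an illustrative value.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("some input text")` downloads the weights on first use. The mT5 tokenizer also needs the `sentencepiece` package to be installed.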

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 15
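To make the schedule concrete, the sketch below works through the arithmetic implied by these hyperparameters. The 975 steps per epoch come from the training results table. Zero warmup steps, a single device, and no gradient accumulation are assumptions, since the card lists none of them. `linear_lr` is a hypothetical helper, not the Trainer's own scheduler.

```python
LEARNING_RATE = 2e-05   # from the card
TRAIN_BATCH_SIZE = 4    # from the card
NUM_EPOCHS = 15         # from the card
STEPS_PER_EPOCH = 975   # from the results table (step 975 at epoch 1.0)

# Total optimizer steps; matches the last logged step in the results table.
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def linear_lr(step, base_lr=LEARNING_RATE, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Linear warmup followed by linear decay to zero.

    warmup_steps=0 is an assumption; the card does not list a warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Upper bound on the training-set size, assuming one device and no
# gradient accumulation (the last batch of an epoch may be partial).
max_train_examples = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE

print("total steps:", TOTAL_STEPS)
print("max training examples:", max_train_examples)
for epoch in (0, 5, 10, 15):
    print(f"epoch {epoch:2d}: lr = {linear_lr(epoch * STEPS_PER_EPOCH):.3e}")
```

Under those assumptions the run takes 14625 optimizer steps, and the training set holds between 3897 and 3900 examples.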

Training results

| Training Loss | Epoch | Step  | Validation Loss | Bleu   | Gen Len |
|---------------|-------|-------|-----------------|--------|---------|
| 13.8417       | 1.0   | 975   | 2.6438          | 0.563  | 15.6487 |
| 2.8117        | 2.0   | 1950  | 1.4148          | 0.891  | 17.2223 |
| 1.8883        | 3.0   | 2925  | 1.0693          | 0.401  | 18.7582 |
| 1.5248        | 4.0   | 3900  | 0.8703          | 0.4583 | 18.8508 |
| 1.3116        | 5.0   | 4875  | 0.7483          | 0.4651 | 18.8856 |
| 1.1617        | 6.0   | 5850  | 0.6783          | 0.4542 | 18.9005 |
| 1.0636        | 7.0   | 6825  | 0.6243          | 0.459  | 18.9054 |
| 0.9928        | 8.0   | 7800  | 0.5869          | 0.4707 | 18.9038 |
| 0.9272        | 9.0   | 8775  | 0.5536          | 0.4563 | 18.9031 |
| 0.8926        | 10.0  | 9750  | 0.5282          | 0.4606 | 18.9177 |
| 0.8568        | 11.0  | 10725 | 0.5091          | 0.4577 | 18.9226 |
| 0.8341        | 12.0  | 11700 | 0.4964          | 0.4482 | 18.9259 |
| 0.8176        | 13.0  | 12675 | 0.4867          | 0.4539 | 18.9262 |
| 0.806         | 14.0  | 13650 | 0.4812          | 0.4576 | 18.9264 |
| 0.7945        | 15.0  | 14625 | 0.4790          | 0.457  | 18.9295 |
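The per-epoch numbers above can be summarized programmatically. The sketch below copies the rows verbatim and reports which epoch has the lowest validation loss, which has the highest BLEU, and whether validation loss falls at every epoch. The `EpochResult` type and variable names are illustrative and not part of the original training code.

```python
from typing import NamedTuple


class EpochResult(NamedTuple):
    epoch: int
    train_loss: float
    val_loss: float
    bleu: float
    gen_len: float


# Rows copied from the training results table.
RESULTS = [
    EpochResult(1, 13.8417, 2.6438, 0.563, 15.6487),
    EpochResult(2, 2.8117, 1.4148, 0.891, 17.2223),
    EpochResult(3, 1.8883, 1.0693, 0.401, 18.7582),
    EpochResult(4, 1.5248, 0.8703, 0.4583, 18.8508),
    EpochResult(5, 1.3116, 0.7483, 0.4651, 18.8856),
    EpochResult(6, 1.1617, 0.6783, 0.4542, 18.9005),
    EpochResult(7, 1.0636, 0.6243, 0.459, 18.9054),
    EpochResult(8, 0.9928, 0.5869, 0.4707, 18.9038),
    EpochResult(9, 0.9272, 0.5536, 0.4563, 18.9031),
    EpochResult(10, 0.8926, 0.5282, 0.4606, 18.9177),
    EpochResult(11, 0.8568, 0.5091, 0.4577, 18.9226),
    EpochResult(12, 0.8341, 0.4964, 0.4482, 18.9259),
    EpochResult(13, 0.8176, 0.4867, 0.4539, 18.9262),
    EpochResult(14, 0.806, 0.4812, 0.4576, 18.9264),
    EpochResult(15, 0.7945, 0.4790, 0.457, 18.9295),
]

# Epoch with the lowest validation loss.
best_by_loss = min(RESULTS, key=lambda r: r.val_loss)

# Epoch with the highest BLEU score.
best_by_bleu = max(RESULTS, key=lambda r: r.bleu)

# True if validation loss strictly decreases from each epoch to the next.
val_loss_monotonic = all(a.val_loss > b.val_loss for a, b in zip(RESULTS, RESULTS[1:]))

print("lowest validation loss at epoch", best_by_loss.epoch)
print("highest BLEU at epoch", best_by_bleu.epoch)
print("validation loss strictly decreasing:", val_loss_monotonic)
```

Validation loss improves at every epoch, while BLEU peaks at epoch 2 and then settles near 0.46, so the two metrics point to different checkpoints.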

Framework versions

  • Transformers 4.35.2
  • PyTorch 2.1.0+cu121
  • Datasets 2.16.1
  • Tokenizers 0.15.0
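When reproducing results, it can help to compare the local environment against the versions listed above. The following is a small sketch using the standard library's `importlib.metadata`. `PINNED` and `compare_versions` are hypothetical names, and `torch` is assumed to be the installed distribution name for PyTorch.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed on the card, keyed by distribution name.
PINNED = {
    "transformers": "4.35.2",
    "torch": "2.1.0+cu121",
    "datasets": "2.16.1",
    "tokenizers": "0.15.0",
}


def compare_versions(pinned=PINNED):
    """Return {name: (card_version, installed_version_or_None, matches)}."""
    report = {}
    for name, wanted in pinned.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (wanted, installed, installed == wanted)
    return report


for name, (wanted, installed, ok) in compare_versions().items():
    status = "ok" if ok else "differs"
    print(f"{name}: card={wanted} installed={installed} [{status}]")
```

A mismatch does not necessarily mean the model will fail to load. It only flags a difference from the environment recorded at training time.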