md_mt5_0109_v5

This model is a fine-tuned version of Buseak/md_mt5_0109_v4. The training dataset is not specified. It achieves the following results on the evaluation set:

  • Loss: 0.0832
  • Bleu: 0.637
  • Gen Len: 18.9449

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 15
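
To make the linear schedule concrete, the sketch below computes the learning rate at a given optimizer step. It assumes no warmup (the card lists none) and takes the total of 14625 steps from the last row of the training results table. The function name `linear_lr` and the constants are illustrative and are not part of the original training script.

```python
BASE_LR = 2e-05       # learning_rate from the hyperparameter list
TOTAL_STEPS = 14625   # final step in the training results (15 epochs x 975 steps)
WARMUP_STEPS = 0      # assumption: the card does not list any warmup steps


def linear_lr(step: int) -> float:
    """Learning rate at `step`: linear warmup (if any), then linear decay to zero."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    # Print the learning rate at a few epoch boundaries (975 steps per epoch).
    for epoch in (0, 5, 10, 15):
        step = epoch * 975
        print(f"epoch {epoch:>2}  step {step:>5}  lr {linear_lr(step):.2e}")
```

Under these assumptions the rate starts at the configured 2e-05 and reaches zero at the final step, so later epochs take progressively smaller updates.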

Training results

| Training Loss | Epoch | Step  | Validation Loss | Bleu   | Gen Len |
|---------------|-------|-------|-----------------|--------|---------|
| 0.2251        | 1.0   | 975   | 0.1136          | 0.611  | 18.9487 |
| 0.2151        | 2.0   | 1950  | 0.1080          | 0.6183 | 18.9479 |
| 0.2109        | 3.0   | 2925  | 0.1043          | 0.6211 | 18.949  |
| 0.2033        | 4.0   | 3900  | 0.1003          | 0.6203 | 18.9456 |
| 0.2002        | 5.0   | 4875  | 0.0974          | 0.6223 | 18.9451 |
| 0.1948        | 6.0   | 5850  | 0.0946          | 0.6246 | 18.9438 |
| 0.1935        | 7.0   | 6825  | 0.0924          | 0.63   | 18.9487 |
| 0.1843        | 8.0   | 7800  | 0.0902          | 0.6345 | 18.9438 |
| 0.1834        | 9.0   | 8775  | 0.0882          | 0.6307 | 18.9446 |
| 0.1849        | 10.0  | 9750  | 0.0862          | 0.6344 | 18.9459 |
| 0.1851        | 11.0  | 10725 | 0.0856          | 0.6344 | 18.9436 |
| 0.1826        | 12.0  | 11700 | 0.0843          | 0.6377 | 18.9449 |
| 0.1834        | 13.0  | 12675 | 0.0838          | 0.635  | 18.9441 |
| 0.1836        | 14.0  | 13650 | 0.0834          | 0.637  | 18.9449 |
| 0.185         | 15.0  | 14625 | 0.0832          | 0.637  | 18.9449 |
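
The per-epoch results can be inspected programmatically. The sketch below copies the epoch, step, validation loss, and Bleu columns from the results above, then finds the epoch with the highest Bleu and the one with the lowest validation loss. It also estimates the training set size from the steps per epoch, assuming a single device and no gradient accumulation (neither is stated in the card), so that figure is only approximate. All variable names are illustrative.

```python
# (epoch, step, validation_loss, bleu), copied from the training results
ROWS = [
    (1, 975, 0.1136, 0.6110),
    (2, 1950, 0.1080, 0.6183),
    (3, 2925, 0.1043, 0.6211),
    (4, 3900, 0.1003, 0.6203),
    (5, 4875, 0.0974, 0.6223),
    (6, 5850, 0.0946, 0.6246),
    (7, 6825, 0.0924, 0.6300),
    (8, 7800, 0.0902, 0.6345),
    (9, 8775, 0.0882, 0.6307),
    (10, 9750, 0.0862, 0.6344),
    (11, 10725, 0.0856, 0.6344),
    (12, 11700, 0.0843, 0.6377),
    (13, 12675, 0.0838, 0.6350),
    (14, 13650, 0.0834, 0.6370),
    (15, 14625, 0.0832, 0.6370),
]
TRAIN_BATCH_SIZE = 4  # train_batch_size from the hyperparameter list

# Epoch with the highest Bleu and epoch with the lowest validation loss.
best_bleu = max(ROWS, key=lambda row: row[3])
lowest_loss = min(ROWS, key=lambda row: row[2])

# Assumption: one device, no gradient accumulation, so one step = one batch.
steps_per_epoch = ROWS[0][1]
approx_train_examples = steps_per_epoch * TRAIN_BATCH_SIZE

if __name__ == "__main__":
    print(f"highest Bleu: {best_bleu[3]} at epoch {best_bleu[0]}")
    print(f"lowest validation loss: {lowest_loss[2]} at epoch {lowest_loss[0]}")
    print(f"approximate training examples: {approx_train_examples}")
```

The headline metrics at the top of the card match the final epoch, while the highest Bleu in the results occurs a few epochs earlier; validation loss keeps decreasing through the last epoch.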

Framework versions

  • Transformers 4.35.2
  • Pytorch 2.1.0+cu121
  • Datasets 2.16.1
  • Tokenizers 0.15.1

Model size

  • Parameters: 300M
  • Tensor type: F32 (Safetensors)