md_mt5_0109_v4

This model is a fine-tuned version of Buseak/md_mt5_0109_v3. The training dataset is not specified. After the final training epoch it achieves the following results on the evaluation set:

  • Loss: 0.1076
  • Bleu: 0.5986
  • Gen Len: 18.9367
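
Since the card does not yet describe how to run the model, here is a minimal loading sketch. It assumes the checkpoint is published under the repository id `Buseak/md_mt5_0109_v4` (inferred from the base model's namespace, not confirmed by this card), that `transformers`, `torch`, and `sentencepiece` are installed, and that a generation budget of 20 new tokens is suitable (a guess based on the reported Gen Len of about 19). The helper name `generate_text` is hypothetical.

```python
# Assumed repository id: inferred from the base model "Buseak/md_mt5_0109_v3".
MODEL_ID = "Buseak/md_mt5_0109_v4"


def generate_text(text: str, max_new_tokens: int = 20) -> str:
    """Run one input string through the fine-tuned mT5 checkpoint.

    The expected input format is not documented in the card, so the
    text is passed through unchanged (assumption).
    """
    # Imported lazily so this file can be read or imported without transformers.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_text("...")` downloads the weights on first use. For repeated calls it would be better to load the tokenizer and model once and reuse them.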

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 15
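
To make the schedule concrete, the sketch below restates the listed values as a plain dictionary and computes the learning rate a linear scheduler would produce at a given step. The key names follow the `transformers` `TrainingArguments` fields, which is an assumption about how the run was configured. The card lists no warmup, so zero warmup steps are assumed. The total of 14625 steps comes from the 975 optimizer steps per epoch shown in the training results, and the dataset size estimate further assumes a single device with no gradient accumulation.

```python
# Hyperparameters from the list above, keyed by TrainingArguments field names (assumed mapping).
hyperparameters = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 4,
    "per_device_eval_batch_size": 4,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 15,
}

STEPS_PER_EPOCH = 975  # taken from the training results table
TOTAL_STEPS = STEPS_PER_EPOCH * hyperparameters["num_train_epochs"]  # 14625


def linear_lr(step: int, warmup_steps: int = 0) -> float:
    """Learning rate under linear decay to zero, with optional linear warmup.

    warmup_steps=0 is an assumption; the card does not list a warmup setting.
    """
    base_lr = hyperparameters["learning_rate"]
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, TOTAL_STEPS - step)
    return base_lr * remaining / max(1, TOTAL_STEPS - warmup_steps)


# Upper bound on training examples per epoch, assuming one device and no
# gradient accumulation (the final batch of an epoch may be partial).
approx_train_examples = STEPS_PER_EPOCH * hyperparameters["per_device_train_batch_size"]
```

Under these assumptions the rate starts at 2e-05, falls linearly, and reaches zero at the last step, and each epoch covers at most 3900 training examples.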

Training results

| Training Loss | Epoch | Step  | Validation Loss | Bleu   | Gen Len |
|---------------|-------|-------|-----------------|--------|---------|
| 0.319         | 1.0   | 975   | 0.1525          | 0.5661 | 18.9408 |
| 0.3086        | 2.0   | 1950  | 0.1450          | 0.5746 | 18.9364 |
| 0.2952        | 3.0   | 2925  | 0.1376          | 0.5786 | 18.9369 |
| 0.2822        | 4.0   | 3900  | 0.1322          | 0.5793 | 18.9392 |
| 0.272         | 5.0   | 4875  | 0.1272          | 0.5836 | 18.94   |
| 0.2676        | 6.0   | 5850  | 0.1230          | 0.5854 | 18.9377 |
| 0.255         | 7.0   | 6825  | 0.1205          | 0.5873 | 18.9369 |
| 0.2516        | 8.0   | 7800  | 0.1161          | 0.5901 | 18.941  |
| 0.247         | 9.0   | 8775  | 0.1146          | 0.5924 | 18.939  |
| 0.2426        | 10.0  | 9750  | 0.1127          | 0.5947 | 18.9374 |
| 0.234         | 11.0  | 10725 | 0.1108          | 0.5962 | 18.9374 |
| 0.2359        | 12.0  | 11700 | 0.1091          | 0.5983 | 18.9374 |
| 0.2347        | 13.0  | 12675 | 0.1086          | 0.5983 | 18.9377 |
| 0.2271        | 14.0  | 13650 | 0.1077          | 0.5984 | 18.9367 |
| 0.2318        | 15.0  | 14625 | 0.1076          | 0.5986 | 18.9367 |
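
The trend in these numbers can be summarized with a short script. The sketch below copies the validation loss and Bleu columns from the results above and computes the overall change and the gain over the last three epochs. The function name `summarize` and the choice of summary statistics are illustrative, not part of the original training code.

```python
# (epoch, validation_loss, bleu) copied from the training results.
results = [
    (1, 0.1525, 0.5661), (2, 0.1450, 0.5746), (3, 0.1376, 0.5786),
    (4, 0.1322, 0.5793), (5, 0.1272, 0.5836), (6, 0.1230, 0.5854),
    (7, 0.1205, 0.5873), (8, 0.1161, 0.5901), (9, 0.1146, 0.5924),
    (10, 0.1127, 0.5947), (11, 0.1108, 0.5962), (12, 0.1091, 0.5983),
    (13, 0.1086, 0.5983), (14, 0.1077, 0.5984), (15, 0.1076, 0.5986),
]


def summarize(rows):
    """Return simple first-vs-last statistics for the evaluation metrics."""
    first, last = rows[0], rows[-1]
    best = max(rows, key=lambda r: r[2])  # row with the highest Bleu
    return {
        "best_epoch": best[0],
        "bleu_gain": last[2] - first[2],
        "loss_reduction_pct": 100 * (first[1] - last[1]) / first[1],
        "late_bleu_gain": last[2] - rows[-4][2],  # change from epoch 12 to 15
    }


summary = summarize(results)
for name, value in summary.items():
    print(f"{name}: {value:.4f}" if isinstance(value, float) else f"{name}: {value}")
```

Most of the Bleu improvement happens in the first twelve epochs. The change from epoch 12 to epoch 15 is small, which suggests the run had largely plateaued by the end.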

Framework versions

  • Transformers 4.35.2
  • Pytorch 2.1.0+cu121
  • Datasets 2.16.1
  • Tokenizers 0.15.0

Model size: 300M params (Safetensors, tensor type F32)