md_mt5_0109_v8

This model is a fine-tuned version of Buseak/md_mt5_0109_v7; the training dataset is not specified (the card records it as None). It achieves the following results on the evaluation set:

  • Loss: 0.0444
  • Bleu: 0.6614
  • Gen Len: 18.9444

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 15
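
The linear scheduler listed above can be illustrated with a small pure-Python sketch. It is not the training code: it assumes zero warmup steps (the card lists no warmup setting) and takes 975 optimizer steps per epoch from the training results table, giving 15 × 975 = 14625 steps in total. The name `linear_lr` is hypothetical and introduced only for this illustration.

```python
# Values copied from the hyperparameter list on this card.
LEARNING_RATE = 2e-05
NUM_EPOCHS = 15

# Taken from the training results table (step 975 is reached at epoch 1.0).
STEPS_PER_EPOCH = 975
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH  # 14625

# Assumption: no warmup, because the card does not list a warmup setting.
WARMUP_STEPS = 0


def linear_lr(step, base_lr=LEARNING_RATE, total_steps=TOTAL_STEPS,
              warmup_steps=WARMUP_STEPS):
    """Learning rate after `step` updates: linear warmup, then linear decay to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Print the scheduled learning rate at a few epoch boundaries.
for epoch in (0, 5, 10, 15):
    step = epoch * STEPS_PER_EPOCH
    print(f"epoch {epoch:>2}  step {step:>5}  lr {linear_lr(step):.3e}")
```

Under these assumptions the learning rate starts at 2e-05, falls to two thirds of that value by the end of epoch 5, and reaches zero at step 14625.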

Training results

| Training Loss | Epoch | Step  | Validation Loss | Bleu   | Gen Len |
|---------------|-------|-------|-----------------|--------|---------|
| 0.1129        | 1.0   | 975   | 0.0597          | 0.6517 | 18.9418 |
| 0.1094        | 2.0   | 1950  | 0.0567          | 0.654  | 18.9372 |
| 0.1101        | 3.0   | 2925  | 0.0543          | 0.657  | 18.9415 |
| 0.1097        | 4.0   | 3900  | 0.0520          | 0.6555 | 18.9446 |
| 0.1091        | 5.0   | 4875  | 0.0511          | 0.6571 | 18.9446 |
| 0.1102        | 6.0   | 5850  | 0.0497          | 0.6591 | 18.9451 |
| 0.1056        | 7.0   | 6825  | 0.0489          | 0.6585 | 18.9444 |
| 0.1088        | 8.0   | 7800  | 0.0470          | 0.6595 | 18.9436 |
| 0.1103        | 9.0   | 8775  | 0.0467          | 0.6589 | 18.9415 |
| 0.1078        | 10.0  | 9750  | 0.0462          | 0.66   | 18.9423 |
| 0.1106        | 11.0  | 10725 | 0.0451          | 0.6605 | 18.9431 |
| 0.1112        | 12.0  | 11700 | 0.0448          | 0.6607 | 18.9444 |
| 0.1134        | 13.0  | 12675 | 0.0447          | 0.6607 | 18.9395 |
| 0.1183        | 14.0  | 13650 | 0.0446          | 0.6602 | 18.9408 |
| 0.1188        | 15.0  | 14625 | 0.0444          | 0.6614 | 18.9444 |
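
A few quantities can be derived from the table above with simple arithmetic. The sketch below copies the validation loss and Bleu columns and computes the overall change between epoch 1 and epoch 15. The training-set size estimate assumes a single device and no gradient accumulation, neither of which the card states, so it should be read as an upper bound rather than a reported figure.

```python
# (epoch, validation_loss, bleu) copied from the training results table.
RESULTS = [
    (1, 0.0597, 0.6517), (2, 0.0567, 0.654), (3, 0.0543, 0.657),
    (4, 0.0520, 0.6555), (5, 0.0511, 0.6571), (6, 0.0497, 0.6591),
    (7, 0.0489, 0.6585), (8, 0.0470, 0.6595), (9, 0.0467, 0.6589),
    (10, 0.0462, 0.66), (11, 0.0451, 0.6605), (12, 0.0448, 0.6607),
    (13, 0.0447, 0.6607), (14, 0.0446, 0.6602), (15, 0.0444, 0.6614),
]

TRAIN_BATCH_SIZE = 4   # from the hyperparameter list
STEPS_PER_EPOCH = 975  # step count at epoch 1.0

# Assumption: one device and no gradient accumulation, so each step
# consumes one batch of 4 examples. The last batch may be partial,
# which makes this an upper bound on the training-set size.
max_train_examples = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE

first, last = RESULTS[0], RESULTS[-1]
loss_drop_pct = 100 * (first[1] - last[1]) / first[1]
bleu_gain = last[2] - first[2]
best_bleu_epoch = max(RESULTS, key=lambda row: row[2])[0]

# True when validation loss decreases at every epoch.
loss_strictly_decreasing = all(
    later[1] < earlier[1] for earlier, later in zip(RESULTS, RESULTS[1:])
)

print(f"training examples (upper bound): {max_train_examples}")
print(f"validation loss drop: {loss_drop_pct:.1f}%")
print(f"Bleu gain: {bleu_gain:.4f}")
print(f"best Bleu at epoch: {best_bleu_epoch}")
print(f"validation loss strictly decreasing: {loss_strictly_decreasing}")
```

The validation loss falls at every epoch while Bleu moves by less than 0.01 over the whole run, so most of the measurable change in this fine-tuning stage shows up in the loss rather than in the Bleu score.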

Framework versions

  • Transformers 4.38.2
  • Pytorch 2.2.1+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2

Model size: 300M params (Safetensors, tensor type F32)