
md_mt5_base_boun_split_second_v1_retrain_on_second_imst_v2_1112

This model is a fine-tuned version of Buseak/md_mt5_base_boun_split_second_v1_retrain_on_second_imst on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1052
  • Bleu: 2.2599
  • Gen Len: 18.7745

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 15
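
The linear scheduler listed above decays the learning rate from its initial value toward zero over the full run. A minimal sketch of that decay is below, using the step counts implied by the training-results table (916 steps per epoch × 15 epochs = 13,740 steps); no warmup_steps value is listed on this card, so zero warmup is assumed, and the `linear_lr` helper name is hypothetical, not part of the training code.

```python
def linear_lr(step: int, base_lr: float = 2e-5, total_steps: int = 13740) -> float:
    """Learning rate after `step` optimizer steps, linearly decayed to 0.

    Assumes zero warmup steps, since the card does not list a warmup value.
    """
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps

print(linear_lr(0))       # full base learning rate at the start
print(linear_lr(6870))    # half the base rate at the halfway point
print(linear_lr(13740))   # fully decayed to zero at the final step
```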

Training results

| Training Loss | Epoch | Step  | Validation Loss | Bleu   | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:------:|:-------:|
| 0.2959        | 1.0   | 916   | 0.1507          | 2.123  | 18.7726 |
| 0.2784        | 2.0   | 1832  | 0.1420          | 2.1538 | 18.771  |
| 0.2741        | 3.0   | 2748  | 0.1376          | 2.1514 | 18.7701 |
| 0.2635        | 4.0   | 3664  | 0.1304          | 2.1802 | 18.7745 |
| 0.2551        | 5.0   | 4580  | 0.1255          | 2.1975 | 18.7775 |
| 0.2572        | 6.0   | 5496  | 0.1225          | 2.2077 | 18.7775 |
| 0.2504        | 7.0   | 6412  | 0.1188          | 2.2158 | 18.7781 |
| 0.2449        | 8.0   | 7328  | 0.1152          | 2.2287 | 18.7781 |
| 0.2425        | 9.0   | 8244  | 0.1121          | 2.2366 | 18.7748 |
| 0.2383        | 10.0  | 9160  | 0.1100          | 2.2437 | 18.777  |
| 0.2458        | 11.0  | 10076 | 0.1081          | 2.2543 | 18.7786 |
| 0.2334        | 12.0  | 10992 | 0.1075          | 2.2505 | 18.7786 |
| 0.2415        | 13.0  | 11908 | 0.1060          | 2.2519 | 18.775  |
| 0.2401        | 14.0  | 12824 | 0.1052          | 2.2579 | 18.7742 |
| 0.2435        | 15.0  | 13740 | 0.1052          | 2.2599 | 18.7745 |

Framework versions

  • Transformers 4.35.2
  • Pytorch 2.1.0+cu118
  • Datasets 2.15.0
  • Tokenizers 0.15.0