
md_mt5_base_boun_split_second_v1_retrain_on_boun_1612

This model is a fine-tuned version of Buseak/md_mt5_base_boun_split_second_v1_retrain_on_second_imst on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1146
  • Bleu: 0.7971
  • Gen Len: 18.8069
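The Bleu figure above is reported on a 0–1 scale. The exact evaluation code is not given in this card; as a rough illustration of what the metric measures (modified n-gram precision with a brevity penalty), here is a toy sentence-level BLEU in plain Python. This is a hedged sketch for intuition only, not the scorer used to produce the number above (that is typically a corpus-level implementation such as sacrebleu):

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(hypothesis, reference, max_n=4):
    """Toy sentence-level BLEU: uniform weights, no smoothing."""
    hyp, ref = hypothesis.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        hyp_counts = Counter(ngrams(hyp, n))
        ref_counts = Counter(ngrams(ref, n))
        # Clipped n-gram matches: each hypothesis n-gram counts at most
        # as often as it appears in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        total = max(len(hyp) - n + 1, 0)
        if total == 0:
            return 0.0
        precisions.append(overlap / total)
    if min(precisions) == 0:
        return 0.0  # unsmoothed BLEU is 0 if any n-gram order has no match
    log_avg = sum(math.log(p) for p in precisions) / max_n
    # Brevity penalty punishes hypotheses shorter than the reference.
    bp = 1.0 if len(hyp) >= len(ref) else math.exp(1 - len(ref) / len(hyp))
    return bp * math.exp(log_avg)
```

A perfect match scores 1.0, and any missing 4-gram order drives the unsmoothed score to 0, which is why real evaluators add smoothing and average over the whole corpus.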

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 15
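With lr_scheduler_type set to linear, the learning rate decays from 2e-05 to 0 over the course of training. A minimal sketch of that schedule, assuming zero warmup steps (the Trainer default when none is listed) and 14625 total optimizer steps, inferred from the results table (975 steps per epoch × 15 epochs):

```python
def linear_lr(step, total_steps, base_lr=2e-5, warmup_steps=0):
    """Linear LR schedule: ramp up over warmup_steps, then decay to 0.

    total_steps=14625 is an assumption based on the step column of the
    training results (975 optimizer steps/epoch x 15 epochs).
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = total_steps - step
    return base_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))

TOTAL_STEPS = 975 * 15  # 14625, matching the final step in the results table
```

For example, the rate is 2e-05 at step 0, half that at the midpoint, and 0 at step 14625.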

Training results

| Training Loss | Epoch | Step  | Validation Loss | Bleu   | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:------:|:-------:|
| 0.3542        | 1.0   | 975   | 0.1679          | 0.7453 | 18.7959 |
| 0.3354        | 2.0   | 1950  | 0.1553          | 0.7492 | 18.8033 |
| 0.3183        | 3.0   | 2925  | 0.1500          | 0.7576 | 18.7972 |
| 0.3013        | 4.0   | 3900  | 0.1421          | 0.7618 | 18.7969 |
| 0.2912        | 5.0   | 4875  | 0.1364          | 0.7673 | 18.8005 |
| 0.2816        | 6.0   | 5850  | 0.1321          | 0.7732 | 18.8003 |
| 0.2753        | 7.0   | 6825  | 0.1287          | 0.7825 | 18.8067 |
| 0.2718        | 8.0   | 7800  | 0.1252          | 0.7866 | 18.8051 |
| 0.2631        | 9.0   | 8775  | 0.1222          | 0.7885 | 18.799  |
| 0.2576        | 10.0  | 9750  | 0.1197          | 0.793  | 18.8041 |
| 0.2541        | 11.0  | 10725 | 0.1182          | 0.7935 | 18.8056 |
| 0.2505        | 12.0  | 11700 | 0.1165          | 0.7956 | 18.8056 |
| 0.2492        | 13.0  | 12675 | 0.1156          | 0.7971 | 18.8056 |
| 0.2534        | 14.0  | 13650 | 0.1146          | 0.7964 | 18.8069 |
| 0.2455        | 15.0  | 14625 | 0.1146          | 0.7971 | 18.8069 |

Framework versions

  • Transformers 4.35.2
  • Pytorch 2.1.0+cu121
  • Datasets 2.15.0
  • Tokenizers 0.15.0

Model files

  • Format: Safetensors
  • Model size: 300M params
  • Tensor type: F32