metadata

license: apache-2.0
base_model: Buseak/md_mt5_base_boun_split_first_v2
tags:
  - generated_from_trainer
metrics:
  - bleu
model-index:
  - name: md_mt5_base_boun_split_second_v1_retrain_on_first_boun
    results: []

md_mt5_base_boun_split_second_v1_retrain_on_first_boun

This model is a fine-tuned version of Buseak/md_mt5_base_boun_split_first_v2. The training dataset was not recorded in this card. It achieves the following results on the evaluation set:

Loss: 0.2417
Bleu: 0.7017
Gen Len: 18.7854

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 4
eval_batch_size: 4
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 15
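
The values above can be mapped onto `Seq2SeqTrainingArguments` from Transformers. The following is a minimal sketch, not the original training script. It assumes single-device training, so the reported batch sizes are passed as per-device values. The `output_dir` value and the `build_training_args` helper are hypothetical, and `predict_with_generate=True` is assumed because BLEU and Gen Len are computed from generated text.

```python
# Hyperparameters copied from the list above, keyed by the matching
# Seq2SeqTrainingArguments keyword names.
HPARAMS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 4,  # assumption: single device
    "per_device_eval_batch_size": 4,   # assumption: single device
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 15,
}


def build_training_args(output_dir="mt5-retrain-sketch"):
    """Build training arguments; requires `transformers` to be installed."""
    # Imported lazily so the dictionary above can be inspected without the library.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(
        output_dir=output_dir,       # hypothetical path, not from the card
        predict_with_generate=True,  # assumption: metrics need generated text
        **HPARAMS,
    )
```

Any setting not listed in the card, such as warmup or gradient accumulation, is left at the library default in this sketch.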

Training results

Training Loss	Epoch	Step	Validation Loss	Bleu	Gen Len
0.8304	1.0	975	0.4605	0.5978	18.7538
0.7517	2.0	1950	0.4132	0.6142	18.7521
0.6960	3.0	2925	0.3806	0.6384	18.7754
0.6427	4.0	3900	0.3525	0.6520	18.7659
0.6183	5.0	4875	0.3270	0.6645	18.7759
0.5737	6.0	5850	0.3105	0.6698	18.7810
0.5498	7.0	6825	0.2940	0.6736	18.7764
0.5223	8.0	7800	0.2789	0.6896	18.7823
0.5131	9.0	8775	0.2697	0.6887	18.7808
0.4961	10.0	9750	0.2599	0.6944	18.7823
0.4815	11.0	10725	0.2536	0.6960	18.7836
0.4791	12.0	11700	0.2480	0.6994	18.7859
0.4671	13.0	12675	0.2430	0.6992	18.7856
0.4690	14.0	13650	0.2417	0.7011	18.7856
0.4642	15.0	14625	0.2417	0.7017	18.7854
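
The Step column gives enough information to sanity-check the schedule. The sketch below reproduces the shape of a linear decay over the 14625 total steps and estimates an upper bound on the training set size. It assumes zero warmup steps, a single device, and no gradient accumulation, none of which the card states. The names `linear_lr` and `max_train_examples` are illustrative only.

```python
BASE_LR = 2e-5           # learning_rate from the hyperparameter list
STEPS_PER_EPOCH = 975    # Step value after epoch 1 in the table
NUM_EPOCHS = 15
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 14625, the final Step value
TRAIN_BATCH_SIZE = 4


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate at `step` under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


def max_train_examples(steps_per_epoch=STEPS_PER_EPOCH, batch_size=TRAIN_BATCH_SIZE):
    """Upper bound on examples per epoch; the last batch may be partially filled."""
    return steps_per_epoch * batch_size


if __name__ == "__main__":
    for epoch in (0, 5, 10, 15):
        print(epoch, linear_lr(epoch * STEPS_PER_EPOCH))
    print("at most", max_train_examples(), "training examples per epoch")
```

Under these assumptions the learning rate would be about 1.33e-05 at the end of epoch 5 and zero at step 14625, and each epoch would cover at most 3900 training examples.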

Framework versions

Transformers 4.35.2
Pytorch 2.1.0+cu118
Datasets 2.15.0
Tokenizers 0.15.0