# kobart_8_1e-4_datav2_min30_lp5.0_temperature1.0
This model is a fine-tuned version of gogamza/kobart-base-v2 on an unspecified dataset.
It achieves the following results on the evaluation set:
- Loss: 3.0961
- Rouge1: 35.8883
- Rouge2: 12.7003
- Rougel: 23.3874
- Bleu1: 30.2528
- Bleu2: 17.5183
- Bleu3: 10.2094
- Bleu4: 5.6021
- Gen Len: 50.1562
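The Bleu1–Bleu4 scores above are n-gram precision metrics. The exact evaluation script is not given in this card; as an illustration only, a minimal pure-Python sketch of the modified (clipped) n-gram precision that underlies BLEU — real Korean summarization outputs would first be tokenized, e.g. by the KoBART tokenizer:

```python
from collections import Counter

def ngram_precision(candidate: str, reference: str, n: int) -> float:
    """Modified n-gram precision: candidate n-gram counts are clipped
    by their counts in the reference, as in BLEU."""
    cand = candidate.split()
    ref = reference.split()
    cand_ngrams = Counter(tuple(cand[i:i + n]) for i in range(len(cand) - n + 1))
    ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
    overlap = sum(min(c, ref_ngrams[g]) for g, c in cand_ngrams.items())
    total = sum(cand_ngrams.values())
    return overlap / total if total else 0.0

# Hypothetical toy pair (English for readability):
cand = "the cat sat on the mat"
ref = "the cat is on the mat"
print(ngram_precision(cand, ref, 1))  # 5 of 6 unigrams match -> 0.8333...
print(ngram_precision(cand, ref, 2))  # 3 of 5 bigrams match -> 0.6
```

Note this sketch omits the brevity penalty and corpus-level aggregation that a full BLEU implementation (e.g. in an evaluation library) would apply.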
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 8
- eval_batch_size: 128
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 5.0
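Together, `lr_scheduler_type: linear` and `lr_scheduler_warmup_ratio: 0.1` mean the learning rate ramps linearly from 0 to the peak 1e-4 over the first 10% of training steps, then decays linearly back to 0. A minimal sketch of that schedule (the total step count here is illustrative, not taken from this run):

```python
def linear_schedule_lr(step: int, total_steps: int,
                       peak_lr: float = 1e-4,
                       warmup_ratio: float = 0.1) -> float:
    """Linear warmup to peak_lr over warmup_ratio of training,
    then linear decay to 0 (mirrors the `linear` scheduler)."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    return peak_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

# Illustrative values with a hypothetical 1000-step run:
print(linear_schedule_lr(0, 1000))    # 0.0 (start of warmup)
print(linear_schedule_lr(100, 1000))  # 0.0001 (peak, end of warmup)
print(linear_schedule_lr(550, 1000))  # 5e-05 (halfway through decay)
```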
### Training results
Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Bleu1 | Bleu2 | Bleu3 | Bleu4 | Gen Len |
---|---|---|---|---|---|---|---|---|---|---|---|
2.4648 | 0.19 | 1000 | 2.9491 | 32.241 | 10.5261 | 21.21 | 26.5995 | 14.7371 | 7.8411 | 4.1361 | 48.303 |
2.4028 | 0.38 | 2000 | 2.9226 | 33.8957 | 11.6309 | 22.4654 | 28.1592 | 15.9817 | 9.163 | 5.0564 | 49.5175 |
2.4109 | 0.57 | 3000 | 2.9092 | 33.9997 | 11.4619 | 22.2822 | 28.0021 | 15.7774 | 8.7258 | 4.5887 | 44.6807 |
2.3846 | 0.76 | 4000 | 2.8763 | 31.8881 | 10.1122 | 21.1754 | 25.4518 | 13.7126 | 7.4549 | 3.9979 | 40.9161 |
2.2972 | 0.94 | 5000 | 2.8441 | 33.4146 | 11.8371 | 22.7219 | 27.1678 | 15.4977 | 9.1783 | 5.3303 | 43.8765 |
2.0162 | 1.13 | 6000 | 2.8372 | 34.9461 | 11.8978 | 22.7877 | 28.9743 | 16.3778 | 9.2932 | 5.0534 | 47.1585 |
1.9816 | 1.32 | 7000 | 2.8630 | 33.1249 | 10.8834 | 22.0846 | 27.0042 | 14.9508 | 8.3482 | 4.5422 | 44.676 |
2.0172 | 1.51 | 8000 | 2.7998 | 34.1663 | 11.5471 | 22.8156 | 28.0367 | 15.7969 | 8.6235 | 4.5914 | 44.9254 |
2.017 | 1.7 | 9000 | 2.7865 | 33.3775 | 11.194 | 22.6083 | 26.7485 | 14.9797 | 8.2559 | 4.279 | 41.5828 |
1.9734 | 1.89 | 10000 | 2.7532 | 34.7147 | 12.353 | 23.0917 | 28.8012 | 16.7472 | 9.7079 | 5.5416 | 47.9883 |
1.5937 | 2.08 | 11000 | 2.8433 | 34.9402 | 12.2318 | 23.2483 | 28.8006 | 16.5212 | 9.6008 | 5.3947 | 45.2401 |
1.6112 | 2.27 | 12000 | 2.8377 | 34.9291 | 12.2349 | 23.278 | 28.8423 | 16.539 | 9.7674 | 5.4267 | 44.7599 |
1.603 | 2.45 | 13000 | 2.8223 | 35.3837 | 12.5491 | 23.5272 | 29.3683 | 16.9828 | 9.6955 | 5.3166 | 47.6037 |
1.6274 | 2.64 | 14000 | 2.8220 | 34.0515 | 11.7884 | 22.829 | 27.6635 | 15.8021 | 8.9724 | 4.9314 | 44.1235 |
1.6435 | 2.83 | 15000 | 2.8139 | 34.9239 | 12.2122 | 22.9939 | 29.1796 | 16.763 | 9.5513 | 5.174 | 46.7832 |
1.238 | 3.02 | 16000 | 2.9615 | 35.456 | 12.3012 | 23.3111 | 29.8676 | 17.0768 | 9.8694 | 5.4376 | 51.1935 |
1.2767 | 3.21 | 17000 | 2.9781 | 35.2632 | 12.1441 | 23.2537 | 29.1438 | 16.6216 | 9.353 | 5.1593 | 46.0793 |
1.2868 | 3.4 | 18000 | 2.9723 | 34.6808 | 11.9638 | 22.9058 | 28.9988 | 16.4994 | 9.3619 | 5.1178 | 47.4732 |
1.2842 | 3.59 | 19000 | 2.9688 | 35.3792 | 12.5174 | 23.2012 | 29.6403 | 17.1517 | 9.9507 | 5.5561 | 49.1515 |
1.2931 | 3.78 | 20000 | 2.9694 | 35.7525 | 12.8025 | 23.5228 | 29.8102 | 17.3544 | 10.239 | 5.6637 | 49.1189 |
1.2733 | 3.97 | 21000 | 2.9618 | 35.8931 | 12.627 | 23.5571 | 30.0482 | 17.2582 | 9.8412 | 5.4747 | 48.5082 |
0.963 | 4.15 | 22000 | 3.1113 | 35.7523 | 12.7633 | 23.3127 | 30.0193 | 17.4211 | 10.2596 | 5.853 | 51.6993 |
0.9563 | 4.34 | 23000 | 3.1031 | 35.8437 | 12.6323 | 23.6011 | 30.0923 | 17.4089 | 9.9831 | 5.5993 | 48.7646 |
0.992 | 4.53 | 24000 | 3.1016 | 36.1067 | 13.3428 | 24.0267 | 30.0275 | 17.8733 | 10.6929 | 6.2491 | 52.0373 |
0.9722 | 4.72 | 25000 | 3.0956 | 35.4406 | 12.4799 | 23.3418 | 29.5123 | 17.0292 | 9.7401 | 5.3586 | 48.8974 |
0.9519 | 4.91 | 26000 | 3.0961 | 35.8883 | 12.7003 | 23.3874 | 30.2528 | 17.5183 | 10.2094 | 5.6021 | 50.1562 |
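In the table above, validation loss bottoms out at 2.7532 around step 10000 (epoch ~1.89) and rises afterward even as training loss keeps falling, the usual overfitting signature. Whether this run kept the final or the best checkpoint is not stated; a trivial sketch of selecting the lowest-validation-loss checkpoint from a logged history (abridged pairs taken from the table):

```python
# (step, validation_loss) pairs, abridged from the training-results table.
history = [(5000, 2.8441), (10000, 2.7532), (15000, 2.8139),
           (20000, 2.9694), (26000, 3.0961)]

# Pick the checkpoint with the lowest validation loss.
best_step, best_loss = min(history, key=lambda p: p[1])
print(best_step, best_loss)  # 10000 2.7532
```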
### Framework versions
- Transformers 4.25.1
- Pytorch 1.13.0+cu117
- Datasets 2.7.1
- Tokenizers 0.13.2