# kobart_8_5.6e-5_min30_lp4_sample

This model is a fine-tuned version of [gogamza/kobart-base-v2](https://huggingface.co/gogamza/kobart-base-v2) on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 2.8230
- Rouge1: 36.1016
- Rouge2: 12.8106
- RougeL: 23.6405
- Bleu1: 30.2521
- Bleu2: 17.5293
- Bleu3: 10.3861
- Bleu4: 5.7474
- Gen Len: 50.6713
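
The suffixes in the model name (min30, lp4, sample) likely encode the generation settings used at evaluation time: `min_length=30`, `length_penalty=4.0`, and sampling. Below is a minimal inference sketch under that assumption; the hub repository id is a placeholder, since the full hub path is not given in this card.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Placeholder repo id -- substitute the actual hub path of this checkpoint.
model_id = "your-username/kobart_8_5.6e-5_min30_lp4_sample"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "요약할 한국어 문서를 여기에 입력합니다."  # Korean document to summarize
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=1024)

# Generation settings inferred from the model name; length_penalty only
# takes effect with beam search, so num_beams > 1 is assumed here.
summary_ids = model.generate(
    **inputs,
    min_length=30,
    max_length=128,
    length_penalty=4.0,
    do_sample=True,
    num_beams=4,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```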
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5.6e-05
- train_batch_size: 8
- eval_batch_size: 128
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 5.0
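
For reference, here is a minimal sketch of how the hyperparameters above map onto `Seq2SeqTrainingArguments` (Transformers 4.25). The output directory is a placeholder, and the Adam betas/epsilon listed above match the library defaults, so they need no explicit arguments.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch only: reproduces the listed hyperparameters. output_dir is a
# placeholder; Adam betas=(0.9, 0.999) and epsilon=1e-8 are the defaults.
training_args = Seq2SeqTrainingArguments(
    output_dir="kobart_8_5.6e-5_min30_lp4_sample",
    learning_rate=5.6e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=128,
    seed=42,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=5.0,
    predict_with_generate=True,  # assumed, since ROUGE/BLEU are reported
)
```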
### Training results

Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | Bleu1 | Bleu2 | Bleu3 | Bleu4 | Gen Len |
---|---|---|---|---|---|---|---|---|---|---|---|
2.527 | 0.19 | 1000 | 3.0014 | 31.7677 | 9.8681 | 20.6405 | 26.0107 | 13.8904 | 7.2892 | 3.6709 | 48.8228 |
2.4185 | 0.38 | 2000 | 2.8850 | 32.5931 | 10.727 | 21.3403 | 26.4666 | 14.6488 | 8.0738 | 4.2365 | 44.3497 |
2.3546 | 0.57 | 3000 | 2.8285 | 32.7686 | 11.0203 | 21.8204 | 26.9575 | 15.0974 | 8.365 | 4.6586 | 45.0956 |
2.2617 | 0.76 | 4000 | 2.7775 | 34.1375 | 12.1264 | 22.506 | 28.082 | 16.2606 | 9.486 | 5.3768 | 49.4872 |
2.2106 | 0.94 | 5000 | 2.7396 | 33.4733 | 11.2845 | 22.5126 | 27.3856 | 15.3472 | 8.567 | 4.6969 | 44.2401 |
2.0022 | 1.13 | 6000 | 2.7534 | 33.9237 | 11.84 | 22.5473 | 27.5719 | 15.8555 | 9.0337 | 5.1397 | 45.2611 |
1.9749 | 1.32 | 7000 | 2.7258 | 35.1741 | 12.4088 | 22.8272 | 29.4193 | 17.0056 | 9.9196 | 5.5038 | 50.3124 |
1.993 | 1.51 | 8000 | 2.7026 | 35.8572 | 13.2373 | 23.5429 | 30.1024 | 17.7802 | 10.6998 | 6.133 | 51.9953 |
1.9461 | 1.7 | 9000 | 2.6379 | 35.0541 | 12.4639 | 23.4095 | 28.7614 | 16.7411 | 9.5243 | 5.3422 | 45.7319 |
1.9159 | 1.89 | 10000 | 2.6071 | 35.3005 | 13.0834 | 23.5232 | 29.2371 | 17.3405 | 10.0603 | 5.9913 | 46.3846 |
1.6347 | 2.08 | 11000 | 2.6773 | 35.6737 | 12.7968 | 23.5884 | 30.0898 | 17.5699 | 10.0439 | 5.984 | 51.4755 |
1.6179 | 2.27 | 12000 | 2.6652 | 35.6258 | 13.0066 | 24.1646 | 29.4431 | 17.3774 | 10.4055 | 6.0368 | 47.2121 |
1.613 | 2.45 | 13000 | 2.6667 | 35.6093 | 12.3267 | 23.4513 | 29.6818 | 17.0819 | 9.7674 | 5.4192 | 48.1632 |
1.6642 | 2.64 | 14000 | 2.6516 | 36.1341 | 12.9256 | 23.6283 | 30.3579 | 17.689 | 10.3152 | 5.6037 | 47.9534 |
1.6432 | 2.83 | 15000 | 2.6498 | 37.3996 | 14.1165 | 24.4384 | 31.3868 | 18.8878 | 11.6758 | 7.0218 | 51.0769 |
1.371 | 3.02 | 16000 | 2.7315 | 36.2931 | 13.1544 | 23.6259 | 30.5586 | 17.9341 | 10.2612 | 5.6973 | 53.0606 |
1.374 | 3.21 | 17000 | 2.7438 | 36.2938 | 13.3253 | 23.8868 | 30.2665 | 17.9543 | 10.6402 | 6.1801 | 48.0303 |
1.3962 | 3.4 | 18000 | 2.7682 | 35.8607 | 12.9747 | 23.7071 | 30.0202 | 17.573 | 10.256 | 5.8021 | 49.8578 |
1.3699 | 3.59 | 19000 | 2.7530 | 36.1645 | 12.8211 | 23.5026 | 30.2944 | 17.6159 | 10.18 | 5.6959 | 51.3846 |
1.3552 | 3.78 | 20000 | 2.7558 | 36.1135 | 12.6383 | 23.1973 | 30.2234 | 17.3569 | 9.9499 | 5.577 | 50.2098 |
1.37 | 3.97 | 21000 | 2.7441 | 35.9377 | 12.744 | 23.3985 | 30.1982 | 17.5623 | 10.1743 | 5.8601 | 51.704 |
1.1739 | 4.15 | 22000 | 2.8335 | 36.126 | 12.8817 | 23.4948 | 30.259 | 17.6231 | 10.2709 | 5.772 | 51.5035 |
1.1966 | 4.34 | 23000 | 2.8219 | 36.3689 | 12.7938 | 23.7675 | 30.5862 | 17.6182 | 10.3642 | 6.0505 | 49.5664 |
1.1812 | 4.53 | 24000 | 2.8206 | 36.3009 | 13.2677 | 23.65 | 30.5531 | 18.0616 | 10.7975 | 6.3877 | 51.6783 |
1.1885 | 4.72 | 25000 | 2.8247 | 36.0696 | 13.0568 | 23.7406 | 30.4063 | 17.8602 | 10.7829 | 6.0939 | 50.4499 |
1.165 | 4.91 | 26000 | 2.8230 | 36.1016 | 12.8106 | 23.6405 | 30.2521 | 17.5293 | 10.3861 | 5.7474 | 50.6713 |
### Framework versions
- Transformers 4.25.1
- Pytorch 1.13.0+cu117
- Datasets 2.7.1
- Tokenizers 0.13.2