
kobart_8_6e-5_datav2_min30_lp5.0_temperature1.0

This model is a fine-tuned version of gogamza/kobart-base-v2 on an unspecified dataset. It achieves the following results on the evaluation set (a minimal inference sketch follows the list):

  • Loss: 2.8935
  • Rouge1: 35.9396
  • Rouge2: 12.7251
  • Rougel: 23.4072
  • Bleu1: 29.8836
  • Bleu2: 17.3868
  • Bleu3: 10.1034
  • Bleu4: 5.6852
  • Gen Len: 50.5012
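Since usage is not otherwise documented, here is a minimal inference sketch, assuming the checkpoint is published under the name in the title and targets Korean abstractive summarization (KoBART's usual downstream task). The `min_length` and `length_penalty` values below are guesses read off the checkpoint name (`min30`, `lp5.0`) and are not confirmed by the card:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Hypothetical Hub id: substitute the actual path this checkpoint is published under.
MODEL_ID = "kobart_8_6e-5_datav2_min30_lp5.0_temperature1.0"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

document = "요약할 한국어 문서를 여기에 넣으세요."  # Korean document to summarize
inputs = tokenizer(document, return_tensors="pt", truncation=True, max_length=1024)

# min_length and length_penalty are read off the checkpoint name (min30 / lp5.0);
# num_beams and max_length are placeholder decoding choices.
with torch.no_grad():
    summary_ids = model.generate(
        **inputs,
        num_beams=4,
        max_length=128,  # Gen Len above averages ~50 tokens
        min_length=30,
        length_penalty=5.0,
    )

print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```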

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a Seq2SeqTrainingArguments sketch follows the list):

  • learning_rate: 6e-05
  • train_batch_size: 8
  • eval_batch_size: 128
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 5.0
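For reference, a minimal sketch of how these hyperparameters map onto `Seq2SeqTrainingArguments`. The actual training script is not published, so `output_dir` and anything not listed above are placeholders:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="kobart_8_6e-5_datav2_min30_lp5.0_temperature1.0",  # placeholder
    learning_rate=6e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=128,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=5.0,
    evaluation_strategy="steps",
    eval_steps=1000,  # inferred from the 1000-step cadence in the results table
)
```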

Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge1  | Rouge2  | Rougel  | Bleu1   | Bleu2   | Bleu3   | Bleu4  | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|:-------:|:-------:|:-------:|:-------:|:------:|:-------:|
| 2.5006        | 0.19  | 1000  | 2.9748          | 31.9305 | 10.219  | 20.9486 | 25.9772 | 14.0989 | 7.5807  | 3.9049 | 46.8951 |
| 2.3738        | 0.38  | 2000  | 2.8691          | 34.1196 | 11.4746 | 22.0999 | 28.4466 | 16.0082 | 8.9955  | 4.6276 | 52.7669 |
| 2.3468        | 0.57  | 3000  | 2.8207          | 34.1168 | 11.3998 | 22.5175 | 28.3223 | 15.791  | 8.5992  | 4.6269 | 43.3869 |
| 2.3217        | 0.76  | 4000  | 2.7748          | 33.0369 | 11.0712 | 22.1962 | 27.127  | 15.1147 | 8.3628  | 4.6229 | 43.7366 |
| 2.2252        | 0.94  | 5000  | 2.7395          | 34.4044 | 12.5602 | 23.0083 | 28.3603 | 16.6789 | 9.7892  | 5.6717 | 47.5828 |
| 1.9933        | 1.13  | 6000  | 2.7503          | 34.5083 | 11.7179 | 22.196  | 28.8115 | 16.4201 | 9.3595  | 4.9562 | 52.1865 |
| 1.963         | 1.32  | 7000  | 2.7527          | 33.7739 | 11.3831 | 22.3692 | 27.633  | 15.5257 | 8.7664  | 4.8824 | 45.3497 |
| 1.997         | 1.51  | 8000  | 2.7051          | 35.9943 | 12.9136 | 23.8678 | 30.0639 | 17.6209 | 10.5702 | 6.1691 | 46.5128 |
| 1.9855        | 1.7   | 9000  | 2.6832          | 34.1919 | 11.6503 | 22.7604 | 27.9586 | 15.8212 | 8.7798  | 4.906  | 45.3566 |
| 1.9522        | 1.89  | 10000 | 2.6502          | 35.5575 | 12.6492 | 23.1904 | 29.4797 | 17.1112 | 9.9781  | 5.7052 | 50.0559 |
| 1.6341        | 2.08  | 11000 | 2.7328          | 34.6455 | 11.8656 | 22.9323 | 28.484  | 16.09   | 9.0409  | 5.0875 | 44.0932 |
| 1.645         | 2.27  | 12000 | 2.7198          | 35.0304 | 12.3304 | 23.4026 | 28.7978 | 16.6707 | 9.6501  | 5.4396 | 45.3427 |
| 1.6333        | 2.45  | 13000 | 2.7258          | 35.6562 | 12.7612 | 23.3402 | 29.9319 | 17.4185 | 10.2105 | 5.6995 | 51.2727 |
| 1.6663        | 2.64  | 14000 | 2.7008          | 34.2188 | 11.7236 | 22.6835 | 28.2471 | 15.9416 | 9.0996  | 4.8797 | 45.1818 |
| 1.6786        | 2.83  | 15000 | 2.7106          | 35.3961 | 12.1801 | 23.1129 | 29.6386 | 17.0003 | 9.7356  | 5.3716 | 49.1958 |
| 1.3555        | 3.02  | 16000 | 2.8057          | 35.4698 | 12.4315 | 23.2317 | 29.5758 | 16.9988 | 9.8794  | 5.5261 | 49.8089 |
| 1.3975        | 3.21  | 17000 | 2.8155          | 35.7874 | 13.1167 | 24.1395 | 29.7118 | 17.4772 | 10.4028 | 5.8877 | 47.1608 |
| 1.3958        | 3.4   | 18000 | 2.8128          | 35.7796 | 12.7994 | 23.701  | 29.8194 | 17.3474 | 10.0427 | 5.3794 | 51.2005 |
| 1.3929        | 3.59  | 19000 | 2.8084          | 35.7019 | 12.8359 | 23.4838 | 29.8411 | 17.506  | 10.2791 | 5.6268 | 50.5897 |
| 1.4165        | 3.78  | 20000 | 2.8067          | 35.4685 | 12.3161 | 23.4552 | 29.8108 | 17.0718 | 9.636   | 5.4738 | 49.0769 |
| 1.399         | 3.97  | 21000 | 2.8022          | 36.0382 | 13.0705 | 23.8823 | 30.0459 | 17.5222 | 10.2384 | 5.7993 | 50.0979 |
| 1.1604        | 4.15  | 22000 | 2.9069          | 35.9586 | 12.9506 | 23.5262 | 30.2279 | 17.6621 | 10.4464 | 6.0544 | 53.4755 |
| 1.14          | 4.34  | 23000 | 2.9020          | 35.6245 | 12.2182 | 23.4536 | 29.8692 | 17.0002 | 9.7911  | 5.5078 | 49.5944 |
| 1.1943        | 4.53  | 24000 | 2.8960          | 35.9293 | 12.6219 | 23.4135 | 30.077  | 17.4198 | 10.1376 | 5.6971 | 53.9091 |
| 1.1582        | 4.72  | 25000 | 2.8975          | 35.7625 | 12.7562 | 23.3171 | 29.7443 | 17.4017 | 10.1272 | 5.5476 | 51.5618 |
| 1.1561        | 4.91  | 26000 | 2.8935          | 35.9396 | 12.7251 | 23.4072 | 29.8836 | 17.3868 | 10.1034 | 5.6852 | 50.5012 |
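The Rouge and Bleu columns can in principle be recomputed with the `evaluate` library; a sketch follows. The exact tokenization behind the reported scores (which matters for Korean text) is not documented, so recomputed numbers may differ:

```python
import evaluate

# Sketch of recomputing the table's metrics; the strings here are stand-ins.
rouge = evaluate.load("rouge")  # rouge1 / rouge2 / rougeL
bleu = evaluate.load("bleu")    # corpus BLEU plus 1- to 4-gram precisions

predictions = ["모델이 생성한 요약문"]      # model-generated summaries
references = ["사람이 작성한 정답 요약"]    # gold summaries

rouge_scores = rouge.compute(predictions=predictions, references=references)
bleu_scores = bleu.compute(
    predictions=predictions,
    references=[[ref] for ref in references],  # one reference per prediction
    max_order=4,
)

print(rouge_scores["rouge1"], rouge_scores["rouge2"], rouge_scores["rougeL"])
print(bleu_scores["precisions"])  # 1- to 4-gram precisions, cf. Bleu1-Bleu4
```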

Framework versions

  • Transformers 4.25.1
  • PyTorch 1.13.0+cu117
  • Datasets 2.7.1
  • Tokenizers 0.13.2