
bart-large-finetuned-xsum

This model is a fine-tuned version of facebook/bart-large on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.7085
  • Rouge1: 93.7743
  • Rouge2: 90.9799
  • Rougel: 93.7951
  • Rougelsum: 93.7675
  • Gen Len: 10.7959
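
As a rough usage illustration, the checkpoint can be loaded with the transformers summarization pipeline. This is a minimal sketch: the repo id below is a hypothetical placeholder for wherever this model is hosted, and the generation lengths are guesses informed by the short average Gen Len reported above.

```python
# Minimal inference sketch. The repo id is a hypothetical placeholder;
# substitute the actual hub id or local path of this checkpoint.
from transformers import pipeline

summarizer = pipeline(
    "summarization",
    model="your-username/bart-large-finetuned-xsum",  # hypothetical id
)

text = "Replace this with the document you want to summarize."
# Short max_length chosen to match the ~10-token average Gen Len above.
result = summarizer(text, max_length=32, min_length=4, do_sample=False)
print(result[0]["summary_text"])
```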

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 30
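
For context, here is a minimal sketch of how these hyperparameters map onto a Seq2SeqTrainer setup. The dataset variables are placeholders, since the training data is not documented in this card, and the Adam betas/epsilon listed above are the optimizer defaults.

```python
# Sketch of the training setup implied by the hyperparameters above.
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

model_name = "facebook/bart-large"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Placeholders (assumptions): the card does not document the dataset,
# so supply your own tokenized train/validation splits here.
train_dataset = ...
eval_dataset = ...

args = Seq2SeqTrainingArguments(
    output_dir="bart-large-finetuned-xsum",
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,                      # Adam betas/epsilon above are the defaults
    lr_scheduler_type="linear",
    num_train_epochs=30,
    evaluation_strategy="epoch",  # evaluated once per epoch, as in the table below
    predict_with_generate=True,   # needed to compute ROUGE during evaluation
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
    tokenizer=tokenizer,
)
trainer.train()
```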

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| No log        | 1.0   | 50   | 0.6634          | 83.4679 | 75.0163 | 83.4978 | 83.5205   | 9.5204  |
| No log        | 2.0   | 100  | 0.7003          | 87.6834 | 82.1534 | 87.6691 | 87.6372   | 11.2041 |
| No log        | 3.0   | 150  | 0.6851          | 92.341  | 89.3673 | 92.265  | 92.306    | 10.6633 |
| No log        | 4.0   | 200  | 0.5687          | 82.5008 | 75.639  | 82.6478 | 82.485    | 9.1531  |
| No log        | 5.0   | 250  | 1.1993          | 90.2087 | 86.3398 | 90.1494 | 90.073    | 11.4592 |
| No log        | 6.0   | 300  | 0.5020          | 86.2842 | 81.3427 | 86.1805 | 86.0801   | 10.0408 |
| No log        | 7.0   | 350  | 0.5845          | 88.6278 | 83.9881 | 88.4848 | 88.6153   | 9.8878  |
| No log        | 8.0   | 400  | 0.6150          | 91.3071 | 87.7098 | 91.3283 | 91.311    | 10.4796 |
| No log        | 9.0   | 450  | 0.5937          | 90.9829 | 85.4487 | 91.0795 | 91.0271   | 11.2755 |
| 0.2951        | 10.0  | 500  | 0.6871          | 91.0166 | 88.4471 | 90.9538 | 91.0866   | 10.2041 |
| 0.2951        | 11.0  | 550  | 0.6682          | 91.4535 | 87.1402 | 91.422  | 91.3889   | 10.8571 |
| 0.2951        | 12.0  | 600  | 0.6011          | 92.0081 | 87.9292 | 91.9871 | 91.9615   | 11.6531 |
| 0.2951        | 13.0  | 650  | 0.8260          | 92.3687 | 89.0047 | 92.4395 | 92.4088   | 10.6224 |
| 0.2951        | 14.0  | 700  | 0.9396          | 91.7057 | 87.0141 | 91.7057 | 91.628    | 11.2245 |
| 0.2951        | 15.0  | 750  | 0.8138          | 91.1908 | 86.4812 | 91.1969 | 91.2138   | 11.602  |
| 0.2951        | 16.0  | 800  | 0.8685          | 93.3392 | 89.4402 | 93.341  | 93.3289   | 10.8061 |
| 0.2951        | 17.0  | 850  | 0.7764          | 91.5805 | 87.9478 | 91.5089 | 91.4414   | 11.551  |
| 0.2951        | 18.0  | 900  | 0.6408          | 88.2589 | 83.4929 | 88.2428 | 88.1257   | 9.8367  |
| 0.2951        | 19.0  | 950  | 0.6844          | 93.2318 | 90.7216 | 93.3116 | 93.2035   | 10.5306 |
| 0.1066        | 20.0  | 1000 | 0.7665          | 94.0825 | 91.5035 | 94.104  | 94.0729   | 10.8878 |
| 0.1066        | 21.0  | 1050 | 0.6803          | 93.8229 | 90.7038 | 93.886  | 93.7719   | 11.3469 |
| 0.1066        | 22.0  | 1100 | 0.8246          | 93.0925 | 89.8534 | 93.0948 | 93.0231   | 11.7857 |
| 0.1066        | 23.0  | 1150 | 0.7397          | 93.0087 | 89.9417 | 93.0176 | 92.9489   | 11.3878 |
| 0.1066        | 24.0  | 1200 | 0.7468          | 93.2956 | 90.0867 | 93.3264 | 93.2707   | 10.5816 |
| 0.1066        | 25.0  | 1250 | 0.7766          | 92.9672 | 89.7517 | 92.9915 | 92.9125   | 11.5816 |
| 0.1066        | 26.0  | 1300 | 0.7415          | 93.1965 | 89.9231 | 93.2259 | 93.1154   | 11.102  |
| 0.1066        | 27.0  | 1350 | 0.7283          | 93.2911 | 90.0648 | 93.348  | 93.3104   | 10.7245 |
| 0.1066        | 28.0  | 1400 | 0.7374          | 93.6969 | 90.4839 | 93.6888 | 93.6523   | 10.8163 |
| 0.1066        | 29.0  | 1450 | 0.6907          | 93.7121 | 90.8289 | 93.7581 | 93.6831   | 10.8571 |
| 0.0663        | 30.0  | 1500 | 0.7085          | 93.7743 | 90.9799 | 93.7951 | 93.7675   | 10.7959 |
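
The ROUGE scores above follow the usual Hugging Face convention of reporting values scaled to 0-100. A small sketch of computing them with the evaluate library (with illustrative strings, not the actual evaluation set) looks like this:

```python
# ROUGE computation sketch with the Hugging Face `evaluate` library.
# The prediction/reference strings are illustrative placeholders.
import evaluate

rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=["the cat sat on the mat"],
    references=["the cat sat on the mat"],
    use_stemmer=True,
)
# Recent versions of `evaluate` return fractions in [0, 1];
# the table above reports them multiplied by 100.
print({k: round(v * 100, 4) for k, v in scores.items()})
```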

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.1.0+cu118
  • Datasets 2.15.0
  • Tokenizers 0.13.3