bart-cnn-pubmed-arxiv-pubmed-v3-e32

This model is a fine-tuned version of theojolliffe/bart-cnn-pubmed-arxiv-pubmed on an unknown dataset. After the final (32nd) training epoch, at step 12736, it achieves the following results on the evaluation set:

Loss: 0.9707
Rouge1: 58.6575
Rouge2: 47.1055
Rougel: 50.0715
Rougelsum: 57.58
Gen Len: 142.0
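
For readers who want to try the checkpoint, the following is a minimal sketch of loading it for summarization. It assumes the Hub repository id is theojolliffe/bart-cnn-pubmed-arxiv-pubmed-v3-e32 (the author namespace of the base model plus this card's title) and that a 142-token generation cap, suggested by the Gen Len value above, is appropriate. The helper name `summarize` is hypothetical and not part of the original card.

```python
# Assumed Hub repository id: base-model namespace + this card's title.
MODEL_ID = "theojolliffe/bart-cnn-pubmed-arxiv-pubmed-v3-e32"

# Gen Len is 142.0 in the evaluation results, so 142 is used here as the
# cap on generated tokens. This is an assumption, not a documented setting.
GENERATION_KWARGS = {"max_length": 142}


def summarize(text: str) -> str:
    """Return a summary of `text` produced by the fine-tuned checkpoint."""
    # Imported lazily so the constants above can be inspected without
    # transformers installed. The first call downloads the model weights.
    from transformers import pipeline

    summarizer = pipeline("summarization", model=MODEL_ID)
    result = summarizer(text, **GENERATION_KWARGS)
    # The summarization pipeline returns a list with one dict per input.
    return result[0]["summary_text"]
```

Calling `summarize(report_text)` on a long input string should return a single summary string. Because the card does not document the training data, output quality on any particular domain is untested here.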

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 2
eval_batch_size: 2
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 32
mixed_precision_training: Native AMP
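
As a rough sketch, the hyperparameters above could be expressed as keyword arguments for `Seq2SeqTrainingArguments` in Transformers. The mapping of `train_batch_size` to the per-device batch size, the per-epoch evaluation, and `predict_with_generate` are assumptions inferred from the results table, not settings stated in this card. The helper names `build_training_args` and `linear_lr` are hypothetical. `linear_lr` illustrates what a linear schedule does to the learning rate, assuming no warmup, since none is listed.

```python
STEPS_PER_EPOCH = 398   # step count after epoch 1 in the training results
NUM_EPOCHS = 32
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 12736, the final step in the table
BASE_LR = 2e-05

TRAINING_KWARGS = {
    "learning_rate": BASE_LR,
    "per_device_train_batch_size": 2,  # assumed meaning of train_batch_size
    "per_device_eval_batch_size": 2,   # assumed meaning of eval_batch_size
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_train_epochs": NUM_EPOCHS,
    "fp16": True,  # "Native AMP"; requires a CUDA device
    # Assumptions: the table reports ROUGE once per epoch, which suggests
    # per-epoch evaluation with generation. Newer Transformers releases
    # rename evaluation_strategy; this matches the 4.18.0 listed in the card.
    "evaluation_strategy": "epoch",
    "predict_with_generate": True,
}


def build_training_args(output_dir: str):
    """Build training arguments; imports transformers only when called."""
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)


def linear_lr(step: int, warmup_steps: int = 0) -> float:
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return BASE_LR * step / max(1, warmup_steps)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - warmup_steps)
```

Under these assumptions the learning rate starts at 2e-05, is roughly 1e-05 halfway through training (step 6368), and reaches zero at step 12736.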

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum	Gen Len
No log	1.0	398	0.9589	52.4374	32.0538	34.189	49.8178	142.0
1.1222	2.0	796	0.8144	54.363	35.2782	37.5982	51.9121	142.0
0.6686	3.0	1194	0.7747	53.3334	34.9112	38.1684	50.9676	142.0
0.4394	4.0	1592	0.7660	53.2391	34.1677	38.4917	50.582	142.0
0.4394	5.0	1990	0.7508	54.3922	36.631	39.6881	52.4238	142.0
0.2962	6.0	2388	0.8112	53.9595	36.1326	38.937	51.8107	142.0
0.201	7.0	2786	0.7842	55.3659	38.4021	41.1556	53.3145	142.0
0.1414	8.0	3184	0.7557	54.8476	38.7707	41.8756	53.3081	142.0
0.107	9.0	3582	0.8296	55.7594	39.3691	41.6456	53.9381	142.0
0.107	10.0	3980	0.8298	54.8163	38.9233	42.4104	52.9344	142.0
0.0838	11.0	4378	0.8492	56.3438	41.5532	44.6348	54.6106	141.8704
0.0637	12.0	4776	0.8619	56.8559	41.2682	43.4566	54.7799	142.0
0.051	13.0	5174	0.8733	57.4154	42.6009	44.401	56.0209	142.0
0.04	14.0	5572	0.8777	58.3095	44.7657	47.8527	56.7276	142.0
0.04	15.0	5970	0.8711	57.6542	43.1785	46.3796	56.0532	142.0
0.0341	16.0	6368	0.9038	57.7274	43.5198	45.8797	56.1525	142.0
0.0272	17.0	6766	0.8845	58.4461	44.9513	47.6616	57.0634	142.0
0.0231	18.0	7164	0.9108	58.5774	46.2637	49.9201	57.1939	141.963
0.018	19.0	7562	0.9059	58.7442	44.7141	47.6061	57.3604	142.0
0.018	20.0	7960	0.9133	57.2809	43.7722	46.2016	55.4421	142.0
0.0159	21.0	8358	0.9245	57.1685	44.5445	48.5015	55.9304	142.0
0.012	22.0	8756	0.9149	57.4727	44.2417	48.0224	56.1341	141.9444
0.0109	23.0	9154	0.9472	58.3537	45.2341	47.8222	56.8061	141.8148
0.0082	24.0	9552	0.9426	58.1553	45.6645	49.019	56.7908	142.0
0.0082	25.0	9950	0.9407	58.3571	46.0699	49.382	57.1456	142.0
0.0071	26.0	10348	0.9654	59.5689	47.2126	50.5317	58.2492	142.0
0.0057	27.0	10746	0.9651	58.2261	46.2797	49.8995	57.0725	142.0
0.0049	28.0	11144	0.9555	57.3502	44.2364	47.6214	55.69	142.0
0.0043	29.0	11542	0.9591	57.3909	44.5927	47.541	56.2071	142.0
0.0043	30.0	11940	0.9637	58.3275	46.1513	49.4288	57.073	142.0
0.0033	31.0	12338	0.9705	58.4669	46.613	49.5711	57.3531	142.0
0.0031	32.0	12736	0.9707	58.6575	47.1055	50.0715	57.58	142.0
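
The table shows validation loss and ROUGE moving in different directions: validation loss bottoms out early and then climbs while training loss approaches zero, yet ROUGE scores keep improving. The sketch below copies the epoch, validation loss, and Rouge1 columns from the table and locates the best epoch under each criterion. The variable names are illustrative only.

```python
# (epoch, validation loss, Rouge1), copied from the training results table.
RESULTS = [
    (1, 0.9589, 52.4374), (2, 0.8144, 54.363), (3, 0.7747, 53.3334),
    (4, 0.7660, 53.2391), (5, 0.7508, 54.3922), (6, 0.8112, 53.9595),
    (7, 0.7842, 55.3659), (8, 0.7557, 54.8476), (9, 0.8296, 55.7594),
    (10, 0.8298, 54.8163), (11, 0.8492, 56.3438), (12, 0.8619, 56.8559),
    (13, 0.8733, 57.4154), (14, 0.8777, 58.3095), (15, 0.8711, 57.6542),
    (16, 0.9038, 57.7274), (17, 0.8845, 58.4461), (18, 0.9108, 58.5774),
    (19, 0.9059, 58.7442), (20, 0.9133, 57.2809), (21, 0.9245, 57.1685),
    (22, 0.9149, 57.4727), (23, 0.9472, 58.3537), (24, 0.9426, 58.1553),
    (25, 0.9407, 58.3571), (26, 0.9654, 59.5689), (27, 0.9651, 58.2261),
    (28, 0.9555, 57.3502), (29, 0.9591, 57.3909), (30, 0.9637, 58.3275),
    (31, 0.9705, 58.4669), (32, 0.9707, 58.6575),
]

best_by_loss = min(RESULTS, key=lambda row: row[1])    # lowest validation loss
best_by_rouge1 = max(RESULTS, key=lambda row: row[2])  # highest Rouge1
final = RESULTS[-1]                                    # the reported checkpoint

print(f"Lowest validation loss: epoch {best_by_loss[0]} ({best_by_loss[1]})")
print(f"Highest Rouge1:         epoch {best_by_rouge1[0]} ({best_by_rouge1[2]})")
print(f"Reported (final):       epoch {final[0]} (loss {final[1]}, Rouge1 {final[2]})")
```

The lowest validation loss is at epoch 5 (0.7508) and the highest Rouge1 at epoch 26 (59.5689), while the reported metrics come from epoch 32. Which checkpoint is preferable depends on whether loss or ROUGE matters more for the intended use. The card does not say whether intermediate checkpoints were kept.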

Framework versions

Transformers 4.18.0
Pytorch 1.11.0+cu113
Datasets 2.1.0
Tokenizers 0.12.1
