bart-cnn-pubhealth-expanded

This model is a fine-tuned version of facebook/bart-large-cnn on the clupubhealth dataset. At the final logged checkpoint (step 19000, epoch 9.78), it achieves the following results on the evaluation set:

Loss: 2.7286
Rouge1: 28.3745
Rouge2: 8.806
Rougel: 19.3896
Rougelsum: 20.7149
Gen Len: 66.075
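
For orientation, here is a minimal sketch of how a checkpoint like this one could be loaded for summarization with the transformers `pipeline` API. The repository id is taken from this model's page. The helper name `summarize` is hypothetical, and the sketch assumes the checkpoint's own generation defaults are acceptable. The import is kept inside the function so that the module can be loaded without downloading anything.

```python
# Hub repository id for this model, as shown on the model page.
MODEL_ID = "zwellington/bart-cnn-pubhealth-expanded"


def summarize(text, model_id=MODEL_ID):
    """Return a summary of `text` (hypothetical helper, not part of the card).

    Downloads the checkpoint on first use, so it needs network access
    and enough memory for a BART-large model.
    """
    # Imported lazily so that importing this module stays cheap.
    from transformers import pipeline

    summarizer = pipeline("summarization", model=model_id)
    # truncation=True clips inputs longer than the model's maximum length;
    # generation lengths are left at the checkpoint's configured defaults.
    result = summarizer(text, truncation=True)
    return result[0]["summary_text"]
```

Calling `summarize("...")` with a health-related claim or article as the argument should return a single summary string. Output quality on other domains has not been reported in this card.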

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 16
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 32
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 10
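
The relationship between these values can be checked with a little arithmetic. The sketch below recomputes the effective batch size, estimates the number of optimizer steps per epoch from the last logged row of the training results (step 19000 at epoch 9.78), and models the linear schedule. It assumes a single device and no warmup, since neither is listed above, and the epoch column is rounded, so the derived numbers are estimates only.

```python
# Hyperparameters copied from the list above.
LEARNING_RATE = 2e-05
TRAIN_BATCH_SIZE = 16
GRADIENT_ACCUMULATION_STEPS = 2
NUM_EPOCHS = 10

# Effective batch size per optimizer step (single device assumed).
total_train_batch_size = TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS

# Last logged row of the training results: step 19000 at epoch 9.78.
steps_per_epoch = 19000 / 9.78  # estimate, limited by epoch rounding
total_steps = round(steps_per_epoch * NUM_EPOCHS)
approx_train_examples = round(steps_per_epoch * total_train_batch_size)


def linear_lr(step, total=total_steps, base_lr=LEARNING_RATE, warmup_steps=0):
    """Linear warmup followed by linear decay to zero.

    warmup_steps=0 is an assumption; the card does not list a warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup_steps))


print(total_train_batch_size)   # effective batch size
print(total_steps)              # estimated optimizer steps over 10 epochs
print(approx_train_examples)    # rough size of the training split
print(linear_lr(total_steps // 2))  # learning rate near the midpoint
```

Under these assumptions the training split works out to roughly 62,000 examples and the run to a little under 19,500 optimizer steps. Treat both as order-of-magnitude figures rather than reported values.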

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum	Gen Len
2.571	0.26	500	2.2030	29.8543	10.1926	20.7137	21.7285	66.6
2.313	0.51	1000	2.1891	29.5708	9.5292	20.0823	21.4907	66.87
2.1371	0.77	1500	2.1981	29.7651	9.4575	20.412	21.2983	65.925
1.9488	1.03	2000	2.3023	29.6158	9.4241	20.6193	21.5966	64.745
1.7406	1.29	2500	2.2808	30.0862	9.8179	20.5477	21.4372	65.17
1.6732	1.54	3000	2.2953	29.65	9.693	20.3996	21.1837	64.48
1.6349	1.8	3500	2.3093	29.9081	9.4101	20.2955	21.381	64.605
1.4981	2.06	4000	2.3376	29.3183	9.2161	20.4919	21.3562	64.73
1.3951	2.32	4500	2.3323	29.9405	9.118	19.9364	21.1458	66.425
1.3775	2.57	5000	2.3597	29.1785	8.7657	19.6031	20.6261	65.505
1.3426	2.83	5500	2.3744	29.1015	8.9953	20.0223	21.1623	64.99
1.2243	3.09	6000	2.4723	28.8329	8.8603	19.9412	21.0484	65.655
1.1798	3.35	6500	2.4063	28.9035	8.9915	19.8531	20.9957	65.93
1.1926	3.6	7000	2.4110	29.4024	8.8828	19.4321	20.763	65.9
1.1791	3.86	7500	2.4147	29.8599	9.168	20.2613	21.4986	65.205
1.0545	4.12	8000	2.4941	27.9696	8.1513	19.5133	20.2316	65.26
1.0513	4.37	8500	2.4345	28.8695	8.7627	19.8116	20.8412	64.375
1.0516	4.63	9000	2.4550	29.3524	9.1717	20.0134	21.1516	65.59
1.0454	4.89	9500	2.4543	29.0709	8.8377	19.9499	20.9215	66.055
0.9247	5.15	10000	2.5152	28.8769	8.7619	19.5535	20.5383	65.455
0.9529	5.4	10500	2.5192	29.4734	8.6629	19.6803	20.9521	66.855
0.953	5.66	11000	2.5530	28.7234	8.5991	19.235	20.3965	64.62
0.9519	5.92	11500	2.5024	28.8013	8.8198	19.091	20.2732	65.16
0.8492	6.18	12000	2.6300	28.8821	8.974	20.1383	21.1273	66.16
0.8705	6.43	12500	2.6192	28.9942	9.0923	20.0151	20.9462	66.17
0.8489	6.69	13000	2.5758	28.5162	8.7087	19.6472	20.6057	68.725
0.8853	6.95	13500	2.5783	29.0936	8.8353	19.8755	20.867	65.61
0.8043	7.21	14000	2.6668	28.198	8.5221	19.2404	20.4359	66.84
0.8004	7.46	14500	2.6676	28.4951	8.8535	19.8777	20.8867	65.99
0.8067	7.72	15000	2.6136	29.2442	8.8243	19.7428	20.9531	66.265
0.8008	7.98	15500	2.6362	28.9875	8.8529	19.6993	20.6463	65.83
0.7499	8.23	16000	2.6987	29.2742	9.0804	19.8464	21.0735	65.66
0.7556	8.49	16500	2.6859	28.5046	8.3465	19.0813	20.2561	65.31
0.7574	8.75	17000	2.7021	29.2861	8.8262	19.5899	20.9786	65.735
0.7524	9.01	17500	2.7160	29.1471	8.9296	20.0009	21.2013	66.415
0.7124	9.26	18000	2.7418	28.8323	8.7672	19.5686	20.5814	67.355
0.7084	9.52	18500	2.7267	28.3833	8.7165	19.0514	20.3386	67.075
0.7251	9.78	19000	2.7286	28.3745	8.806	19.3896	20.7149	66.075
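
The table can be scanned programmatically to see which checkpoint scores best on a given column. The sketch below parses a tab-separated excerpt (the header plus five rows copied from the table above) and picks the row with the lowest validation loss and the row with the highest Rouge1. The function names are illustrative. Pasting the full table in place of the excerpt selects the same two rows.

```python
import csv
import io

# Header plus five rows copied verbatim from the table above (tab-separated).
TABLE_EXCERPT = (
    "Training Loss\tEpoch\tStep\tValidation Loss\tRouge1\tRouge2\tRougel\tRougelsum\tGen Len\n"
    "2.571\t0.26\t500\t2.2030\t29.8543\t10.1926\t20.7137\t21.7285\t66.6\n"
    "2.313\t0.51\t1000\t2.1891\t29.5708\t9.5292\t20.0823\t21.4907\t66.87\n"
    "1.7406\t1.29\t2500\t2.2808\t30.0862\t9.8179\t20.5477\t21.4372\t65.17\n"
    "0.9247\t5.15\t10000\t2.5152\t28.8769\t8.7619\t19.5535\t20.5383\t65.455\n"
    "0.7251\t9.78\t19000\t2.7286\t28.3745\t8.806\t19.3896\t20.7149\t66.075\n"
)


def load_rows(table_text):
    """Parse the tab-separated table into a list of dicts of floats."""
    reader = csv.DictReader(io.StringIO(table_text), delimiter="\t")
    return [{key: float(value) for key, value in row.items()} for row in reader]


def best_by(rows, column, lower_is_better):
    """Return the row with the best value in `column`."""
    pick = min if lower_is_better else max
    return pick(rows, key=lambda row: row[column])


rows = load_rows(TABLE_EXCERPT)
best_loss = best_by(rows, "Validation Loss", lower_is_better=True)
best_rouge1 = best_by(rows, "Rouge1", lower_is_better=False)

print("lowest validation loss at step", int(best_loss["Step"]))
print("highest Rouge1 at step", int(best_rouge1["Step"]))
```

In this table the lowest validation loss (2.1891) occurs at step 1000 and the highest Rouge1 (30.0862) at step 2500, while training loss keeps falling from 2.571 to 0.7251. That pattern is consistent with overfitting after the first epoch, so an earlier checkpoint may be preferable to the final one if it is available.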

Framework versions

Transformers 4.31.0
Pytorch 2.0.1+cu117
Datasets 2.7.1
Tokenizers 0.13.2
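
To reproduce results it can help to confirm that the local environment matches these versions. The sketch below uses only the standard library. The dictionary keys are the usual PyPI distribution names, which is an assumption about how the packages were installed, and the helper name `version_mismatches` is hypothetical.

```python
from importlib import metadata

# Versions listed above. The "+cu117" suffix on PyTorch is a CUDA build tag,
# so only the base version "2.0.1" is compared.
EXPECTED = {
    "transformers": "4.31.0",
    "torch": "2.0.1",
    "datasets": "2.7.1",
    "tokenizers": "0.13.2",
}


def version_mismatches(expected=EXPECTED):
    """Map package name -> (wanted, installed) for every package that differs.

    `installed` is None when the package is not installed at all.
    """
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = (wanted, None)
            continue
        # Drop local build tags such as "+cu117" before comparing.
        if installed.split("+")[0] != wanted:
            report[name] = (wanted, installed)
    return report
```

An empty dictionary from `version_mismatches()` means the installed base versions match the ones listed here.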
