bart-cnn-pubmed-arxiv-pubmed-v3-e43

This model is a fine-tuned version of theojolliffe/bart-cnn-pubmed-arxiv-pubmed on an unspecified dataset. After the final training epoch (43), it achieves the following results on the evaluation set:

Loss: 1.0837
Rouge1: 58.1526
Rouge2: 46.0425
Rougel: 49.5624
Rougelsum: 56.9295
Gen Len: 142.0

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 1
eval_batch_size: 1
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 43
mixed_precision_training: Native AMP
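
As a rough illustration of how these settings fit together, the sketch below lists them under the matching transformers TrainingArguments field names and traces the learning rate implied by the linear scheduler. The field-name mapping (for example train_batch_size to per_device_train_batch_size, Native AMP to fp16) assumes a single-device run, and the zero warmup is an assumption because the card lists no warmup setting. The 795 steps per epoch are taken from the Step column of the training results.

```python
# Hyperparameters from the card, keyed by the corresponding
# transformers TrainingArguments field names (mapping is an assumption).
hparams = {
    "learning_rate": 2e-05,
    "per_device_train_batch_size": 1,   # card: train_batch_size
    "per_device_eval_batch_size": 1,    # card: eval_batch_size
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 43,             # card: num_epochs
    "fp16": True,                       # card: Native AMP
}

STEPS_PER_EPOCH = 795  # from the Step column of the training results
TOTAL_STEPS = STEPS_PER_EPOCH * hparams["num_train_epochs"]


def linear_lr(step, base_lr=hparams["learning_rate"],
              total_steps=TOTAL_STEPS, warmup_steps=0):
    """Linear warmup followed by linear decay to zero.

    warmup_steps=0 is an assumption; the card does not list a warmup value.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the start of training and at the end of selected epochs.
for epoch in (0, 1, 20, 43):
    print(epoch, linear_lr(epoch * STEPS_PER_EPOCH))
```

Under these assumptions the learning rate starts at 2e-05 and reaches zero at the last optimizer step of epoch 43.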

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum	Gen Len
1.2542	1.0	795	0.9354	51.4655	31.6464	34.2376	48.9765	141.963
0.7019	2.0	1590	0.8119	53.3066	34.683	36.4262	50.907	142.0
0.5251	3.0	2385	0.7839	52.4248	32.8685	36.0084	49.9957	142.0
0.3449	4.0	3180	0.7673	52.716	34.7869	38.4201	50.8384	142.0
0.2666	5.0	3975	0.7647	54.6433	37.1337	40.1459	52.4288	141.7778
0.1805	6.0	4770	0.8400	53.5747	36.001	39.5984	51.1935	141.8148
0.1413	7.0	5565	0.7925	53.9875	37.01	40.6532	51.9353	142.0
0.113	8.0	6360	0.7665	56.395	41.5764	44.327	54.7845	142.0
0.0907	9.0	7155	0.8442	55.1407	39.4113	43.0628	53.6503	142.0
0.0824	10.0	7950	0.8469	55.7103	40.6761	43.3754	53.8227	142.0
0.0639	11.0	8745	0.8892	56.0839	40.6204	43.2455	54.4412	142.0
0.0504	12.0	9540	0.8613	56.9634	42.8236	45.4255	55.4026	142.0
0.0447	13.0	10335	0.9341	57.7216	44.104	47.1429	56.4299	142.0
0.0396	14.0	11130	0.9203	56.2073	42.9575	45.8068	54.8089	142.0
0.036	15.0	11925	0.9253	58.5212	45.6047	49.1205	57.0551	142.0
0.0302	16.0	12720	0.9187	58.8046	46.0106	48.0442	57.2799	142.0
0.0261	17.0	13515	0.9578	57.3405	43.8227	46.6317	55.7836	142.0
0.0231	18.0	14310	0.9578	57.7604	44.6164	47.8902	56.2309	141.8148
0.0198	19.0	15105	0.9662	57.774	44.6407	47.5489	56.1936	142.0
0.0165	20.0	15900	0.9509	59.6297	46.5076	48.3507	58.083	142.0
0.0145	21.0	16695	0.9915	58.2245	45.1804	48.1191	56.889	142.0
0.0128	22.0	17490	0.9945	58.2646	46.2782	49.4411	56.992	142.0
0.0129	23.0	18285	1.0069	57.0055	44.1866	46.9101	55.5056	141.9444
0.0116	24.0	19080	0.9967	58.1091	45.5303	48.2208	56.4496	142.0
0.0093	25.0	19875	1.0188	56.59	43.677	45.8956	55.0954	142.0
0.008	26.0	20670	0.9976	58.5408	46.7019	48.9235	57.2562	142.0
0.0077	27.0	21465	1.0123	57.7909	45.7619	48.3412	56.3796	142.0
0.0075	28.0	22260	1.0258	58.1694	45.03	48.282	56.7303	142.0
0.0056	29.0	23055	1.0100	58.0406	45.37	48.0125	56.5288	142.0
0.0049	30.0	23850	1.0235	56.419	43.248	46.3448	54.8467	142.0
0.0042	31.0	24645	1.0395	57.7232	45.6305	48.4531	56.3343	141.9444
0.0034	32.0	25440	1.0605	58.9049	46.8049	49.9103	57.6751	141.5
0.0032	33.0	26235	1.0362	57.8681	45.9028	48.8624	56.5616	141.8704
0.0025	34.0	27030	1.0521	58.8985	46.8547	49.8485	57.4249	142.0
0.0021	35.0	27825	1.0639	58.9324	46.656	49.1907	57.4836	142.0
0.0023	36.0	28620	1.0624	58.5734	46.6774	49.6377	57.3825	142.0
0.0019	37.0	29415	1.0636	58.9899	46.8217	49.4829	57.8683	142.0
0.0018	38.0	30210	1.0640	58.793	46.7964	49.7845	57.6379	142.0
0.0013	39.0	31005	1.0692	57.7124	45.5948	49.0482	56.4246	142.0
0.0012	40.0	31800	1.0746	58.1789	46.458	49.547	57.1007	141.6296
0.0008	41.0	32595	1.0815	57.7392	45.6404	48.4845	56.6464	142.0
0.0009	42.0	33390	1.0853	58.317	46.2661	49.0466	57.0971	142.0
0.0005	43.0	34185	1.0837	58.1526	46.0425	49.5624	56.9295	142.0
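
The headline metrics at the top of the card come from the final epoch, which is not necessarily the best checkpoint. The following sketch copies the Validation Loss and Rouge1 columns from the table above and looks up the epoch that is best on each. The variable names are illustrative only, and nothing here says which checkpoint was actually saved.

```python
# Validation Loss per epoch (epochs 1..43), copied from the table above.
val_loss = [
    0.9354, 0.8119, 0.7839, 0.7673, 0.7647, 0.8400, 0.7925, 0.7665,
    0.8442, 0.8469, 0.8892, 0.8613, 0.9341, 0.9203, 0.9253, 0.9187,
    0.9578, 0.9578, 0.9662, 0.9509, 0.9915, 0.9945, 1.0069, 0.9967,
    1.0188, 0.9976, 1.0123, 1.0258, 1.0100, 1.0235, 1.0395, 1.0605,
    1.0362, 1.0521, 1.0639, 1.0624, 1.0636, 1.0640, 1.0692, 1.0746,
    1.0815, 1.0853, 1.0837,
]

# Rouge1 per epoch (epochs 1..43), copied from the table above.
rouge1 = [
    51.4655, 53.3066, 52.4248, 52.716, 54.6433, 53.5747, 53.9875, 56.395,
    55.1407, 55.7103, 56.0839, 56.9634, 57.7216, 56.2073, 58.5212, 58.8046,
    57.3405, 57.7604, 57.774, 59.6297, 58.2245, 58.2646, 57.0055, 58.1091,
    56.59, 58.5408, 57.7909, 58.1694, 58.0406, 56.419, 57.7232, 58.9049,
    57.8681, 58.8985, 58.9324, 58.5734, 58.9899, 58.793, 57.7124, 58.1789,
    57.7392, 58.317, 58.1526,
]

# Epochs are 1-indexed, list positions are 0-indexed.
best_loss_epoch = min(range(len(val_loss)), key=val_loss.__getitem__) + 1
best_rouge1_epoch = max(range(len(rouge1)), key=rouge1.__getitem__) + 1

print("lowest validation loss:", val_loss[best_loss_epoch - 1],
      "at epoch", best_loss_epoch)
print("highest Rouge1:", rouge1[best_rouge1_epoch - 1],
      "at epoch", best_rouge1_epoch)
```

Validation loss is lowest at epoch 5 (0.7647) and rises afterwards while training loss keeps falling, whereas Rouge1 peaks at epoch 20 (59.6297), so the choice of checkpoint depends on which metric matters for the intended use.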

Framework versions

Transformers 4.19.2
Pytorch 1.11.0+cu113
Datasets 2.2.2
Tokenizers 0.12.1
