metadata

license: mit
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: bart-cnn-pubmed-arxiv-pubmed-arxiv-earlystopping
    results: []

bart-cnn-pubmed-arxiv-pubmed-arxiv-earlystopping

This model is a fine-tuned version of theojolliffe/bart-cnn-pubmed-arxiv-pubmed-arxiv on an unknown dataset. It achieves the following results on the evaluation set; the figures match the last logged evaluation in the training results (step 4750, epoch 11.93):

Loss: 0.8794
Rouge1: 55.9136
Rouge2: 40.6124
Rougel: 43.8806
Rougelsum: 54.2039
Gen Len: 142.0
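
For readers unfamiliar with the metric, the following is a minimal sketch of what a ROUGE-1 F1 score measures: unigram overlap between a reference summary and a generated one. It is a simplification that assumes lowercased whitespace tokenization and is not the scorer that produced the numbers above; the function name `rouge1_f1` and the example sentences are hypothetical. The values in this card appear to be on a 0-100 scale, so the sketch multiplies by 100 when printing.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Simplified ROUGE-1 F1 on lowercased, whitespace-split tokens."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped overlap: each token counts at most as often as it appears in both.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical sentences, not taken from the evaluation set.
score = rouge1_f1("the cat sat on the mat", "the cat lay on the mat")
print(round(100 * score, 2))
```

Rouge2 applies the same idea to bigrams, while Rougel and Rougelsum are based on the longest common subsequence rather than fixed-size n-grams.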

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 2
eval_batch_size: 2
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 1000 (configured maximum; the training log below ends at epoch 11.93)
mixed_precision_training: Native AMP
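
The arithmetic implied by these settings can be sketched as follows. The sketch assumes a single device, no gradient accumulation, and zero warmup steps (none of which the card states), and it takes the step and epoch of the last row of the training log as given. Under those assumptions, a linear schedule stretched over 1000 epochs has barely decayed by the time the log ends. The helper name `linear_lr` is hypothetical.

```python
# Values copied from the hyperparameter list.
learning_rate = 2e-05
train_batch_size = 2
num_epochs = 1000

# Last row of the training log: step 4750 at epoch 11.93.
last_step, last_epoch = 4750, 11.93

# Optimizer steps per epoch, inferred from the log (rounded to an integer).
steps_per_epoch = round(last_step / last_epoch)

# Rough training set size, ASSUMING one device and no gradient accumulation.
approx_train_examples = steps_per_epoch * train_batch_size

# Length of the linear schedule if all configured epochs had run.
total_steps = steps_per_epoch * num_epochs


def linear_lr(step: int, base_lr: float = learning_rate, total: int = total_steps) -> float:
    """Linear decay from base_lr to 0 over `total` steps, ASSUMING no warmup."""
    return base_lr * max(0.0, 1.0 - step / total)


lr_at_stop = linear_lr(last_step)
print(steps_per_epoch, approx_train_examples, total_steps)
print(f"learning rate at step {last_step}: {lr_at_stop:.4e}")
print(f"fraction of base rate remaining: {lr_at_stop / learning_rate:.4f}")
```

If the assumptions hold, the learning rate stays within about 1.2% of its initial value for the whole logged run, so the schedule behaves almost like a constant rate.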

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum	Gen Len
No log	0.31	125	1.2057	50.9436	30.6436	32.6348	48.0674	141.3519
No log	0.63	250	1.0933	52.0677	31.2561	32.8008	49.0282	141.9815
No log	0.94	375	0.9685	51.6623	32.148	34.0536	48.9779	141.5556
1.1594	1.26	500	0.9725	50.4646	30.6781	32.1995	47.3852	142.0
1.1594	1.57	625	0.9342	52.2146	32.2166	33.7256	49.2233	142.0
1.1594	1.88	750	0.8715	52.2443	33.66	36.0575	49.7769	141.6481
1.1594	2.2	875	0.8334	53.0976	33.9638	36.0616	50.7417	141.8889
0.6845	2.51	1000	0.8241	52.3152	32.8571	35.302	49.6273	142.0
0.6845	2.83	1125	0.7986	54.075	35.0318	37.4544	51.4955	142.0
0.6845	3.14	1250	0.8532	52.1242	32.5844	34.6821	49.6048	141.7037
0.6845	3.45	1375	0.8319	52.0714	32.8862	35.3255	49.3984	141.7593
0.4488	3.77	1500	0.8033	53.2189	34.7029	37.5627	50.8068	142.0
0.4488	4.08	1625	0.8322	53.1666	34.8916	37.733	50.9602	142.0
0.4488	4.4	1750	0.7985	51.8809	32.9926	36.3812	49.6592	142.0
0.4488	4.71	1875	0.8049	54.2959	36.648	39.2174	52.2153	141.8148
0.3017	5.03	2000	0.8148	53.1564	35.2561	38.4413	50.9793	141.7778
0.3017	5.34	2125	0.8153	53.5528	35.217	37.9034	51.3596	141.0741
0.3017	5.65	2250	0.8009	52.4906	34.9253	37.9829	50.3951	141.6111
0.3017	5.97	2375	0.7509	54.3645	37.5095	40.5725	52.1743	142.0
0.2052	6.28	2500	0.8019	54.5817	36.5587	40.0273	52.5349	142.0
0.2052	6.6	2625	0.8176	55.3618	38.556	41.5709	53.5806	142.0
0.2052	6.91	2750	0.7956	55.5057	38.0122	40.8857	53.1755	141.9815
0.2052	7.22	2875	0.7966	54.4586	37.4821	40.7638	52.4144	142.0
0.1465	7.54	3000	0.8311	54.3973	37.1016	40.2977	52.3982	142.0
0.1465	7.85	3125	0.8227	53.9072	36.5277	39.0963	51.9937	141.8889
0.1465	8.17	3250	0.7947	54.7043	38.5848	41.2942	52.8724	142.0
0.1465	8.48	3375	0.7954	54.5769	37.8265	40.6915	52.6429	141.9444
0.115	8.79	3500	0.8433	54.7883	38.0489	41.414	52.3718	142.0
0.115	9.11	3625	0.8416	56.5204	41.3216	44.451	54.7371	142.0
0.115	9.42	3750	0.8164	55.2908	39.0328	41.5761	53.4643	142.0
0.115	9.74	3875	0.8363	55.2659	39.4302	42.1691	53.7407	141.8889
0.0912	10.05	4000	0.8850	55.7855	40.6168	43.1968	54.2718	142.0
0.0912	10.36	4125	0.8268	56.1701	40.7518	42.987	54.1229	141.9259
0.0912	10.68	4250	0.8564	55.4179	39.6097	42.3691	53.4582	141.8889
0.0912	10.99	4375	0.8557	56.1136	41.4924	45.8591	54.6113	141.6667
0.0707	11.31	4500	0.8432	55.0109	39.3858	42.0807	53.4629	142.0
0.0707	11.62	4625	0.8377	54.3239	37.7401	40.4619	52.4602	142.0
0.0707	11.93	4750	0.8794	55.9136	40.6124	43.8806	54.2039	142.0
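
Because the best row depends on which metric is monitored, a short sketch of checkpoint selection may help when reading the table. The rows below are copied from the log above (a subset that includes the rows with the lowest validation loss and the highest ROUGE scores); the `EvalRow` type is hypothetical, and the card does not say which metric or patience the training run actually used.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    step: int
    val_loss: float
    rouge1: float
    rouge2: float
    rougeL: float
    rougeLsum: float


# Subset of the training results table, values copied verbatim.
rows = [
    EvalRow(125, 1.2057, 50.9436, 30.6436, 32.6348, 48.0674),
    EvalRow(2375, 0.7509, 54.3645, 37.5095, 40.5725, 52.1743),
    EvalRow(3625, 0.8416, 56.5204, 41.3216, 44.451, 54.7371),
    EvalRow(4375, 0.8557, 56.1136, 41.4924, 45.8591, 54.6113),
    EvalRow(4750, 0.8794, 55.9136, 40.6124, 43.8806, 54.2039),
]

# Lower is better for loss; higher is better for ROUGE.
best_by_loss = min(rows, key=lambda r: r.val_loss)
best_by_rouge1 = max(rows, key=lambda r: r.rouge1)
best_by_rougeL = max(rows, key=lambda r: r.rougeL)

print("lowest validation loss at step", best_by_loss.step)
print("highest Rouge1 at step", best_by_rouge1.step)
print("highest Rougel at step", best_by_rougeL.step)
```

In the full table, validation loss is lowest at step 2375, Rouge1 and Rougelsum peak at step 3625, and Rouge2 and Rougel peak at step 4375, so the headline results (step 4750) are not the best logged evaluation under any of these metrics.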

Framework versions

Transformers 4.18.0
PyTorch 1.11.0+cu113
Datasets 2.2.1
Tokenizers 0.12.1