bart-cnn-pubmed-arxiv-v3-e16

This model is a fine-tuned version of theojolliffe/bart-cnn-pubmed-arxiv on an unknown dataset. It achieves the following results on the evaluation set after the final epoch (epoch 16, step 12720):

Loss: 0.9340
Rouge1: 57.6388
Rouge2: 44.834
Rougel: 47.5043
Rougelsum: 56.1122
Gen Len: 142.0
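
The card gives no usage instructions. The following is a minimal sketch of loading the checkpoint for summarization with the Transformers library, under several assumptions: the Hub repository id `theojolliffe/bart-cnn-pubmed-arxiv-v3-e16` is inferred from the model name and the owner of the base model, the 142-token output cap is chosen to match the reported Gen Len, and the 1024-token input limit and `num_beams=4` are illustrative values, not settings confirmed by the card. The `summarize` helper is hypothetical.

```python
MODEL_ID = "theojolliffe/bart-cnn-pubmed-arxiv-v3-e16"  # assumed Hub repo id
MAX_INPUT_TOKENS = 1024    # assumed input limit for a BART-large checkpoint
MAX_SUMMARY_TOKENS = 142   # assumption: matches the reported Gen Len


def summarize(text: str, model_id: str = MODEL_ID) -> str:
    """Generate a summary of `text` (hypothetical helper, not from the card)."""
    # Imported lazily so this module loads even without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Truncate long documents to the assumed input limit.
    inputs = tokenizer(
        text,
        return_tensors="pt",
        truncation=True,
        max_length=MAX_INPUT_TOKENS,
    )
    # Beam search with an illustrative beam width.
    output_ids = model.generate(
        **inputs,
        max_length=MAX_SUMMARY_TOKENS,
        num_beams=4,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `summarize(document_text)` downloads the weights on first use, so it needs network access and enough memory for the model.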

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 1
eval_batch_size: 1
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 16
mixed_precision_training: Native AMP
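
To make the `linear` scheduler setting concrete, here is a small sketch of how the learning rate would decay over the run. It assumes zero warmup steps (the card does not list a warmup setting) and takes the total of 12720 optimizer steps from the training results table. The function name `linear_lr` is illustrative and not part of any library.

```python
BASE_LR = 2e-05       # learning_rate from the card
NUM_EPOCHS = 16       # num_epochs from the card
TOTAL_STEPS = 12720   # final step in the training results table
STEPS_PER_EPOCH = TOTAL_STEPS // NUM_EPOCHS


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate at `step` under linear warmup followed by linear decay.

    warmup_steps=0 is an assumption; the card does not state a warmup value.
    """
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for epoch in (0, 4, 8, 12, 16):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch:2d}  step {step:5d}  lr {linear_lr(step):.2e}")
```

With a train batch size of 1, the 795 steps per epoch would correspond to roughly 795 training examples, provided training ran on a single device without gradient accumulation; the card states neither.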

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum	Gen Len
1.2407	1.0	795	0.9270	53.3842	33.8559	35.7393	50.6907	142.0
0.704	2.0	1590	0.8092	53.2159	35.0209	37.8641	50.9514	141.963
0.5277	3.0	2385	0.7588	52.7709	34.2453	36.6319	50.1137	142.0
0.3449	4.0	3180	0.7617	52.0249	34.5679	37.3669	49.7643	142.0
0.2668	5.0	3975	0.7575	54.3131	35.3985	38.9242	51.5667	142.0
0.1756	6.0	4770	0.8161	53.6214	36.4376	39.1745	51.3685	142.0
0.1326	7.0	5565	0.7848	55.7549	38.8517	42.0106	53.4243	142.0
0.1051	8.0	6360	0.7912	55.2709	39.952	42.7398	53.6479	142.0
0.0781	9.0	7155	0.8491	55.5698	40.0599	42.9521	53.6734	142.0
0.0685	10.0	7950	0.8684	55.1142	40.3136	43.699	53.5463	142.0
0.0494	11.0	8745	0.8886	57.7988	43.6659	46.0913	56.3383	142.0
0.0338	12.0	9540	0.8827	57.0166	42.7553	46.2344	55.2893	142.0
0.0296	13.0	10335	0.9111	56.7741	42.6116	45.1692	55.2065	142.0
0.0228	14.0	11130	0.9209	56.635	43.2461	46.314	55.049	142.0
0.0189	15.0	11925	0.9193	56.4404	43.4216	46.279	55.1403	142.0
0.0152	16.0	12720	0.9340	57.6388	44.834	47.5043	56.1122	142.0
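
The metrics in the table do not all peak at the same epoch. The sketch below copies the validation loss, Rouge1, and Rouge2 columns from the table and finds the best epoch for each; the `best_epoch` helper is illustrative only.

```python
# (epoch, validation_loss, rouge1, rouge2), copied from the table above.
RESULTS = [
    (1, 0.9270, 53.3842, 33.8559),
    (2, 0.8092, 53.2159, 35.0209),
    (3, 0.7588, 52.7709, 34.2453),
    (4, 0.7617, 52.0249, 34.5679),
    (5, 0.7575, 54.3131, 35.3985),
    (6, 0.8161, 53.6214, 36.4376),
    (7, 0.7848, 55.7549, 38.8517),
    (8, 0.7912, 55.2709, 39.952),
    (9, 0.8491, 55.5698, 40.0599),
    (10, 0.8684, 55.1142, 40.3136),
    (11, 0.8886, 57.7988, 43.6659),
    (12, 0.8827, 57.0166, 42.7553),
    (13, 0.9111, 56.7741, 42.6116),
    (14, 0.9209, 56.635, 43.2461),
    (15, 0.9193, 56.4404, 43.4216),
    (16, 0.9340, 57.6388, 44.834),
]


def best_epoch(rows, column, lower_is_better):
    """Return the epoch whose value in `column` is best."""
    pick = min if lower_is_better else max
    return pick(rows, key=lambda row: row[column])[0]


best_by_loss = best_epoch(RESULTS, column=1, lower_is_better=True)
best_by_rouge1 = best_epoch(RESULTS, column=2, lower_is_better=False)
best_by_rouge2 = best_epoch(RESULTS, column=3, lower_is_better=False)

print("lowest validation loss at epoch", best_by_loss)
print("highest Rouge1 at epoch", best_by_rouge1)
print("highest Rouge2 at epoch", best_by_rouge2)
```

Validation loss is lowest at epoch 5 and rises afterwards while training loss keeps falling, whereas the ROUGE scores continue to improve into the later epochs. Which checkpoint counts as best therefore depends on the metric chosen; the published results correspond to the final epoch.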

Framework versions

Transformers 4.18.0
PyTorch 1.11.0+cu113
Datasets 2.1.0
Tokenizers 0.12.1
