bart-cnn-pubmed-arxiv-v3-e16

This model is a fine-tuned version of theojolliffe/bart-cnn-pubmed-arxiv on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.9340
  • Rouge1: 57.6388
  • Rouge2: 44.834
  • Rougel: 47.5043
  • Rougelsum: 56.1122
  • Gen Len: 142.0
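
The checkpoint can presumably be loaded for summarization with the Transformers `pipeline` API. The sketch below is a minimal example, not the author's documented usage. It assumes the model is published under the repository id `theojolliffe/bart-cnn-pubmed-arxiv-v3-e16` (inferred from the card title and the base model's namespace, not stated in the card), and the helper name `summarize` is hypothetical.

```python
# Assumption: the repository id is inferred from the card title and the base
# model's namespace; the card itself does not state it.
MODEL_ID = "theojolliffe/bart-cnn-pubmed-arxiv-v3-e16"


def summarize(text: str) -> str:
    """Return a summary of `text` generated by the fine-tuned checkpoint."""
    # Imported lazily so that defining this helper does not require the
    # library to be installed or the weights to be downloaded.
    from transformers import pipeline

    summarizer = pipeline("summarization", model=MODEL_ID)
    # truncation=True clips inputs that exceed the model's maximum input length.
    result = summarizer(text, truncation=True)
    return result[0]["summary_text"]
```

Calling `summarize(document_text)` downloads the weights on first use. Generation settings such as the maximum summary length are left at the checkpoint's own defaults here.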

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 16
  • mixed_precision_training: Native AMP
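
To make the `linear` scheduler setting concrete, the sketch below computes the learning rate at a given optimizer step. It assumes zero warmup steps (the card does not report a warmup setting) and takes 795 steps per epoch from the training results, which gives 16 × 795 = 12720 total steps. The function name `linear_lr` is hypothetical and only mirrors the usual linear-decay rule; it is not the trainer's own code.

```python
BASE_LR = 2e-05        # learning_rate from the card
NUM_EPOCHS = 16        # num_epochs from the card
STEPS_PER_EPOCH = 795  # step count after epoch 1 in the training results
WARMUP_STEPS = 0       # assumption: the card does not report any warmup

TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH  # 12720, the final step in the results


def linear_lr(step: int) -> float:
    """Learning rate at `step` under linear warmup followed by linear decay to zero."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    for epoch in (0, 4, 8, 12, 16):
        print(epoch, linear_lr(epoch * STEPS_PER_EPOCH))
```

With `train_batch_size: 1`, 795 steps per epoch would correspond to roughly 795 training examples, assuming a single device and no gradient accumulation; the card confirms neither.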

Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|---------------|-------|-------|-----------------|---------|---------|---------|-----------|---------|
| 1.2407        | 1.0   | 795   | 0.9270          | 53.3842 | 33.8559 | 35.7393 | 50.6907   | 142.0   |
| 0.704         | 2.0   | 1590  | 0.8092          | 53.2159 | 35.0209 | 37.8641 | 50.9514   | 141.963 |
| 0.5277        | 3.0   | 2385  | 0.7588          | 52.7709 | 34.2453 | 36.6319 | 50.1137   | 142.0   |
| 0.3449        | 4.0   | 3180  | 0.7617          | 52.0249 | 34.5679 | 37.3669 | 49.7643   | 142.0   |
| 0.2668        | 5.0   | 3975  | 0.7575          | 54.3131 | 35.3985 | 38.9242 | 51.5667   | 142.0   |
| 0.1756        | 6.0   | 4770  | 0.8161          | 53.6214 | 36.4376 | 39.1745 | 51.3685   | 142.0   |
| 0.1326        | 7.0   | 5565  | 0.7848          | 55.7549 | 38.8517 | 42.0106 | 53.4243   | 142.0   |
| 0.1051        | 8.0   | 6360  | 0.7912          | 55.2709 | 39.952  | 42.7398 | 53.6479   | 142.0   |
| 0.0781        | 9.0   | 7155  | 0.8491          | 55.5698 | 40.0599 | 42.9521 | 53.6734   | 142.0   |
| 0.0685        | 10.0  | 7950  | 0.8684          | 55.1142 | 40.3136 | 43.699  | 53.5463   | 142.0   |
| 0.0494        | 11.0  | 8745  | 0.8886          | 57.7988 | 43.6659 | 46.0913 | 56.3383   | 142.0   |
| 0.0338        | 12.0  | 9540  | 0.8827          | 57.0166 | 42.7553 | 46.2344 | 55.2893   | 142.0   |
| 0.0296        | 13.0  | 10335 | 0.9111          | 56.7741 | 42.6116 | 45.1692 | 55.2065   | 142.0   |
| 0.0228        | 14.0  | 11130 | 0.9209          | 56.635  | 43.2461 | 46.314  | 55.049    | 142.0   |
| 0.0189        | 15.0  | 11925 | 0.9193          | 56.4404 | 43.4216 | 46.279  | 55.1403   | 142.0   |
| 0.0152        | 16.0  | 12720 | 0.9340          | 57.6388 | 44.834  | 47.5043 | 56.1122   | 142.0   |
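
The per-epoch results above can be compared programmatically to see which epoch is best under each criterion. The sketch below copies the epoch, training loss, validation loss, Rouge1, and Rouge2 columns from the table; the variable names are illustrative, and nothing in the card says which criterion, if any, was used to pick the final checkpoint.

```python
# (epoch, training_loss, validation_loss, rouge1, rouge2), copied from the table.
ROWS = [
    (1, 1.2407, 0.9270, 53.3842, 33.8559),
    (2, 0.7040, 0.8092, 53.2159, 35.0209),
    (3, 0.5277, 0.7588, 52.7709, 34.2453),
    (4, 0.3449, 0.7617, 52.0249, 34.5679),
    (5, 0.2668, 0.7575, 54.3131, 35.3985),
    (6, 0.1756, 0.8161, 53.6214, 36.4376),
    (7, 0.1326, 0.7848, 55.7549, 38.8517),
    (8, 0.1051, 0.7912, 55.2709, 39.9520),
    (9, 0.0781, 0.8491, 55.5698, 40.0599),
    (10, 0.0685, 0.8684, 55.1142, 40.3136),
    (11, 0.0494, 0.8886, 57.7988, 43.6659),
    (12, 0.0338, 0.8827, 57.0166, 42.7553),
    (13, 0.0296, 0.9111, 56.7741, 42.6116),
    (14, 0.0228, 0.9209, 56.6350, 43.2461),
    (15, 0.0189, 0.9193, 56.4404, 43.4216),
    (16, 0.0152, 0.9340, 57.6388, 44.8340),
]

# Best epoch under each criterion.
best_val_loss = min(ROWS, key=lambda r: r[2])
best_rouge1 = max(ROWS, key=lambda r: r[3])
best_rouge2 = max(ROWS, key=lambda r: r[4])

# Gap between validation and training loss at the final epoch.
final = ROWS[-1]
final_gap = final[2] - final[1]

if __name__ == "__main__":
    print("lowest validation loss:", best_val_loss[0], best_val_loss[2])
    print("highest Rouge1:", best_rouge1[0], best_rouge1[3])
    print("highest Rouge2:", best_rouge2[0], best_rouge2[4])
    print("final validation minus training loss:", round(final_gap, 4))
```

Validation loss is lowest at epoch 5 (0.7575) and rises to 0.9340 by epoch 16 while training loss keeps falling to 0.0152, a pattern usually associated with overfitting. The ROUGE scores nevertheless keep improving: Rouge1 peaks at epoch 11 and Rouge2 at epoch 16, so the preferred checkpoint depends on which metric matters for the application.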

Framework versions

  • Transformers 4.18.0
  • Pytorch 1.11.0+cu113
  • Datasets 2.1.0
  • Tokenizers 0.12.1