bart-cnn-pubmed-arxiv-pubmed-v3-e16

This model is a fine-tuned version of theojolliffe/bart-cnn-pubmed-arxiv-pubmed on an unknown dataset. After the final (16th) training epoch, it achieves the following results on the evaluation set:

  • Loss: 0.8702
  • Rouge1: 56.1421
  • Rouge2: 41.3514
  • Rougel: 44.5146
  • Rougelsum: 54.3477
  • Gen Len: 142.0

Model description

More information needed

Intended uses & limitations

More information needed
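
Until the intended uses are documented, the following is a minimal sketch of loading the checkpoint for summarization with the Transformers `pipeline` API. The repository id `theojolliffe/bart-cnn-pubmed-arxiv-pubmed-v3-e16` is an assumption, formed from the base model's namespace and this card's title. The `summarize` helper is hypothetical and not part of the original card.

```python
# Assumed Hub repository id: base model's namespace + this card's title.
MODEL_ID = "theojolliffe/bart-cnn-pubmed-arxiv-pubmed-v3-e16"


def summarize(text: str) -> str:
    """Return a summary of `text` using the fine-tuned checkpoint.

    Hypothetical helper: downloads the model on first use, so it needs
    network access and the `transformers` package installed.
    """
    from transformers import pipeline  # imported lazily; heavy dependency

    summarizer = pipeline("summarization", model=MODEL_ID)
    # truncation=True clips inputs longer than the model's maximum length.
    result = summarizer(text, truncation=True)
    return result[0]["summary_text"]
```

Calling `summarize("...")` with a long passage should return a single summary string. Output quality on text unlike the (undocumented) training data is unknown.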

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 16
  • mixed_precision_training: Native AMP
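
The list above maps roughly onto `Seq2SeqTrainingArguments` in Transformers, as sketched below. This is a reconstruction, not the original training script. The `output_dir` value and `predict_with_generate=True` are assumptions, and the batch sizes are treated as per-device values on a single device.

```python
# Keyword arguments mirroring the hyperparameters listed in this card.
TRAINING_KWARGS = {
    "output_dir": "bart-cnn-pubmed-arxiv-pubmed-v3-e16",  # assumption
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 2,  # assumes a single device
    "per_device_eval_batch_size": 2,   # assumes a single device
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 16,
    "fp16": True,  # "Native AMP"; requires a CUDA-capable GPU
    "predict_with_generate": True,  # assumption: needed to compute ROUGE
}


def build_training_args():
    """Instantiate the arguments object (needs `transformers` and a GPU)."""
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(**TRAINING_KWARGS)
```

The result of `build_training_args()` would be passed to a `Seq2SeqTrainer` together with the model, tokenizer, and datasets, none of which are documented in this card.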

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|---|---|---|---|---|---|---|---|---|
| No log | 1.0 | 398 | 0.9532 | 53.1932 | 32.9882 | 35.3852 | 50.6138 | 142.0 |
| 1.1219 | 2.0 | 796 | 0.8252 | 54.1306 | 35.3774 | 37.4334 | 51.6652 | 142.0 |
| 0.6698 | 3.0 | 1194 | 0.7828 | 53.8766 | 35.2945 | 39.2662 | 51.3239 | 142.0 |
| 0.4435 | 4.0 | 1592 | 0.7744 | 53.9029 | 35.2716 | 37.5502 | 51.1179 | 142.0 |
| 0.4435 | 5.0 | 1990 | 0.7644 | 53.8132 | 36.3643 | 39.9548 | 51.5348 | 141.4815 |
| 0.3001 | 6.0 | 2388 | 0.7996 | 53.7376 | 36.2289 | 39.063 | 51.7514 | 142.0 |
| 0.2045 | 7.0 | 2786 | 0.8009 | 54.4924 | 37.3594 | 40.033 | 52.1405 | 142.0 |
| 0.1416 | 8.0 | 3184 | 0.7578 | 55.2039 | 39.0907 | 42.171 | 53.2835 | 142.0 |
| 0.1058 | 9.0 | 3582 | 0.8030 | 54.6634 | 38.2708 | 42.232 | 52.6619 | 142.0 |
| 0.1058 | 10.0 | 3980 | 0.8057 | 53.8692 | 37.943 | 41.1825 | 51.7243 | 142.0 |
| 0.0803 | 11.0 | 4378 | 0.8182 | 56.5077 | 41.5916 | 44.1933 | 54.8699 | 142.0 |
| 0.0599 | 12.0 | 4776 | 0.8261 | 56.9709 | 42.1438 | 45.5351 | 55.0701 | 142.0 |
| 0.0458 | 13.0 | 5174 | 0.8469 | 56.5208 | 42.0329 | 44.4172 | 54.7958 | 142.0 |
| 0.0346 | 14.0 | 5572 | 0.8583 | 56.9187 | 42.4072 | 46.1096 | 55.3656 | 142.0 |
| 0.0346 | 15.0 | 5970 | 0.8653 | 56.503 | 42.047 | 45.8598 | 54.9676 | 141.8519 |
| 0.0293 | 16.0 | 6368 | 0.8702 | 56.1421 | 41.3514 | 44.5146 | 54.3477 | 142.0 |
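
The headline metrics of this card come from the last row (epoch 16), which is not the best row on every metric. As an illustration, the short script below copies the epoch, validation loss, Rouge1, and Rouge2 columns from the training results and reports which epoch wins on each. The `best_epoch` helper is hypothetical and written only for this example.

```python
# (epoch, validation_loss, rouge1, rouge2), copied from the training results.
ROWS = [
    (1, 0.9532, 53.1932, 32.9882),
    (2, 0.8252, 54.1306, 35.3774),
    (3, 0.7828, 53.8766, 35.2945),
    (4, 0.7744, 53.9029, 35.2716),
    (5, 0.7644, 53.8132, 36.3643),
    (6, 0.7996, 53.7376, 36.2289),
    (7, 0.8009, 54.4924, 37.3594),
    (8, 0.7578, 55.2039, 39.0907),
    (9, 0.8030, 54.6634, 38.2708),
    (10, 0.8057, 53.8692, 37.9430),
    (11, 0.8182, 56.5077, 41.5916),
    (12, 0.8261, 56.9709, 42.1438),
    (13, 0.8469, 56.5208, 42.0329),
    (14, 0.8583, 56.9187, 42.4072),
    (15, 0.8653, 56.5030, 42.0470),
    (16, 0.8702, 56.1421, 41.3514),
]


def best_epoch(rows, column, higher_is_better):
    """Return the epoch number of the row with the best value in `column`."""
    pick = max if higher_is_better else min
    return pick(rows, key=lambda row: row[column])[0]


best_by_loss = best_epoch(ROWS, 1, higher_is_better=False)
best_by_rouge1 = best_epoch(ROWS, 2, higher_is_better=True)
best_by_rouge2 = best_epoch(ROWS, 3, higher_is_better=True)

print(best_by_loss, best_by_rouge1, best_by_rouge2)  # prints: 8 12 14
```

Validation loss bottoms out at epoch 8 while ROUGE keeps improving until epochs 12 to 14, and training loss continues to fall throughout. This pattern is consistent with overfitting in the later epochs, so an earlier checkpoint may be preferable depending on which metric matters.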

Framework versions

  • Transformers 4.18.0
  • Pytorch 1.11.0+cu113
  • Datasets 2.1.0
  • Tokenizers 0.12.1