bart-cnn-pubmed-arxiv-pubmed-v3-e8

This model is a fine-tuned version of theojolliffe/bart-cnn-pubmed-arxiv-pubmed on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.7778
  • Rouge1: 55.6307
  • Rouge2: 38.1306
  • Rougel: 40.7127
  • Rougelsum: 53.3739
  • Gen Len: 141.9815

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 8
  • mixed_precision_training: Native AMP
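
As a rough illustration of what `lr_scheduler_type: linear` means for these settings, the sketch below computes the learning rate at each epoch boundary. It assumes zero warmup steps (none are listed above) and takes the totals of 398 steps per epoch and 3184 steps overall from the training results. The function `linear_lr` is a hypothetical helper written for this example, not part of any library.

```python
# Minimal sketch of a linear learning-rate decay.
# Assumptions: no warmup (none is listed in the card), 3184 total optimizer
# steps and 398 steps per epoch (both read from the training results).

BASE_LR = 2e-05
TOTAL_STEPS = 3184
STEPS_PER_EPOCH = 398


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Return the learning rate at `step` under linear warmup then linear decay."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for epoch in range(9):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch}: step {step:4d}, lr {linear_lr(step):.3e}")
```

Under these assumptions the rate has halved by the end of epoch 4 and reaches zero at the final step.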

Training results

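| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len  |
|---------------|-------|------|-----------------|---------|---------|---------|-----------|----------|
| No log        | 1.0   | 398  | 0.9563          | 53.0477 | 33.0365 | 35.4483 | 50.5525   | 142.0    |
| 1.1233        | 2.0   | 796  | 0.8260          | 53.8629 | 34.5031 | 37.08   | 51.129    | 142.0    |
| 0.6753        | 3.0   | 1194 | 0.7898          | 53.6508 | 34.7559 | 37.0541 | 50.7535   | 142.0    |
| 0.4532        | 4.0   | 1592 | 0.7765          | 53.2109 | 34.5657 | 37.3743 | 50.9145   | 142.0    |
| 0.4532        | 5.0   | 1990 | 0.7551          | 55.0766 | 37.5722 | 40.0653 | 52.5655   | 142.0    |
| 0.3142        | 6.0   | 2388 | 0.7744          | 54.7674 | 36.7664 | 39.9027 | 52.1542   | 142.0    |
| 0.2257        | 7.0   | 2786 | 0.7728          | 55.6258 | 37.9929 | 40.8985 | 53.4423   | 142.0    |
| 0.1674        | 8.0   | 3184 | 0.7778          | 55.6307 | 38.1306 | 40.7127 | 53.3739   | 141.9815 |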
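
The per-epoch results do not all peak at the same epoch, which matters when choosing a checkpoint. The sketch below copies the reported validation loss and ROUGE values and finds the best epoch for each metric. `RESULTS`, `COLUMNS`, and `best_epoch` are hypothetical names for this example, and the card does not say which criterion, if any, was used to select the published weights.

```python
# Rows copied from the training results:
# (epoch, validation_loss, rouge1, rouge2, rougel, rougelsum)
RESULTS = [
    (1, 0.9563, 53.0477, 33.0365, 35.4483, 50.5525),
    (2, 0.8260, 53.8629, 34.5031, 37.0800, 51.1290),
    (3, 0.7898, 53.6508, 34.7559, 37.0541, 50.7535),
    (4, 0.7765, 53.2109, 34.5657, 37.3743, 50.9145),
    (5, 0.7551, 55.0766, 37.5722, 40.0653, 52.5655),
    (6, 0.7744, 54.7674, 36.7664, 39.9027, 52.1542),
    (7, 0.7728, 55.6258, 37.9929, 40.8985, 53.4423),
    (8, 0.7778, 55.6307, 38.1306, 40.7127, 53.3739),
]
COLUMNS = ("epoch", "validation_loss", "rouge1", "rouge2", "rougel", "rougelsum")


def best_epoch(metric, lower_is_better=False):
    """Return the epoch with the best value for `metric`."""
    idx = COLUMNS.index(metric)
    pick = min if lower_is_better else max
    return pick(RESULTS, key=lambda row: row[idx])[0]


if __name__ == "__main__":
    print("validation_loss:", best_epoch("validation_loss", lower_is_better=True))
    for name in ("rouge1", "rouge2", "rougel", "rougelsum"):
        print(f"{name}:", best_epoch(name))
```

On these numbers, validation loss is lowest at epoch 5, Rouge1 and Rouge2 are highest at epoch 8, and Rougel and Rougelsum are highest at epoch 7. The differences between epochs 7 and 8 are small.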

Framework versions

  • Transformers 4.18.0
  • Pytorch 1.11.0+cu113
  • Datasets 2.1.0
  • Tokenizers 0.12.1