
distill-pegasus-cnn-arxiv-pubmed-v3-e8

This model is a fine-tuned version of theojolliffe/distill-pegasus-cnn-arxiv-pubmed on an unknown dataset. After the final (eighth) training epoch, it achieves the following results on the evaluation set:

  • Loss: 1.6844
  • Rouge1: 49.0081
  • Rouge2: 30.6784
  • Rougel: 33.5258
  • Rougelsum: 45.5354
  • Gen Len: 125.6852
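The card gives no usage instructions, so the following is only a minimal sketch of how a fine-tuned Pegasus summarization checkpoint like this one is typically loaded with the Transformers library. The repository ID is an assumption (the base model's namespace joined with this card's title), and the generation settings (`max_new_tokens=128`, `num_beams=4`) are illustrative choices, not values documented for this model. The helper name `summarize` is hypothetical.

```python
# Assumed repository ID: the base model's namespace plus this card's title.
# The card itself does not state the full ID.
MODEL_ID = "theojolliffe/distill-pegasus-cnn-arxiv-pubmed-v3-e8"


def summarize(text, model_id=MODEL_ID, max_new_tokens=128, num_beams=4):
    """Return a generated summary of `text` (hypothetical helper).

    The generation settings are illustrative assumptions; the card does not
    document the settings used to produce its evaluation numbers.
    """
    # Imported lazily so this file can be loaded without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Truncate the input to the model's maximum input length.
    inputs = tokenizer(text, truncation=True, return_tensors="pt")
    output_ids = model.generate(
        **inputs, max_new_tokens=max_new_tokens, num_beams=num_beams
    )
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]
```

Calling `summarize("...long document text...")` downloads the checkpoint on first use and requires `torch` and `sentencepiece` alongside `transformers`.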

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 8
  • mixed_precision_training: Native AMP
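To make the `linear` scheduler concrete, here is a small sketch of how the learning rate would decay from 2e-05 to zero over the run. The total of 6360 steps comes from the final row reported under Training results. The warmup length is not stated in this card, so `WARMUP_STEPS = 0` is an assumption, and `linear_lr` is a hypothetical helper, not the training code itself.

```python
BASE_LR = 2e-05      # learning_rate from the hyperparameter list
TOTAL_STEPS = 6360   # final step reported under Training results (8 epochs)
WARMUP_STEPS = 0     # assumption: the card does not state a warmup length


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS,
              warmup_steps=WARMUP_STEPS):
    """Learning rate at `step` under linear warmup followed by linear decay."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Learning rate at the end of each epoch (795 steps per epoch).
    for epoch in range(1, 9):
        step = 795 * epoch
        print(f"epoch {epoch}: step {step}, lr {linear_lr(step):.3e}")
```

Under these assumptions the rate is halved by the end of epoch 4 and reaches zero at step 6360.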

Training results

Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len
2.7633        | 1.0   | 795  | 2.1211          | 48.9615 | 30.3509 | 33.7359 | 44.508    | 124.7963
2.3051        | 2.0   | 1590 | 1.9464          | 48.6806 | 30.452  | 34.2187 | 44.6379   | 124.6296
2.2244        | 3.0   | 2385 | 1.8294          | 48.9739 | 30.6717 | 33.605  | 45.0942   | 125.3704
2.0733        | 4.0   | 3180 | 1.7769          | 49.0049 | 30.8354 | 33.6965 | 44.8603   | 125.7037
1.9759        | 5.0   | 3975 | 1.7192          | 50.3946 | 32.1072 | 34.5453 | 46.4493   | 125.5741
1.9478        | 6.0   | 4770 | 1.7037          | 49.4631 | 31.654  | 34.4601 | 46.2376   | 125.5185
1.9016        | 7.0   | 5565 | 1.6874          | 48.2641 | 29.6354 | 33.1059 | 44.8436   | 125.6852
1.8882        | 8.0   | 6360 | 1.6844          | 49.0081 | 30.6784 | 33.5258 | 45.5354   | 125.6852
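Validation loss and ROUGE do not peak at the same epoch in these results. The short sketch below copies a few columns from the table and picks out the epoch with the lowest validation loss and the epoch with the highest Rouge1. The variable names are illustrative, and the card does not say which criterion, if any, was used to select the published checkpoint.

```python
# (epoch, step, validation_loss, rouge1, rouge2) copied from the results table.
rows = [
    (1, 795, 2.1211, 48.9615, 30.3509),
    (2, 1590, 1.9464, 48.6806, 30.452),
    (3, 2385, 1.8294, 48.9739, 30.6717),
    (4, 3180, 1.7769, 49.0049, 30.8354),
    (5, 3975, 1.7192, 50.3946, 32.1072),
    (6, 4770, 1.7037, 49.4631, 31.654),
    (7, 5565, 1.6874, 48.2641, 29.6354),
    (8, 6360, 1.6844, 49.0081, 30.6784),
]

lowest_loss = min(rows, key=lambda r: r[2])   # row with the smallest validation loss
best_rouge1 = max(rows, key=lambda r: r[3])   # row with the largest Rouge1
steps_per_epoch = rows[-1][1] // len(rows)    # 6360 steps over 8 epochs

print(f"lowest validation loss: epoch {lowest_loss[0]} ({lowest_loss[2]})")
print(f"highest Rouge1:         epoch {best_rouge1[0]} ({best_rouge1[3]})")
print(f"optimizer steps per epoch: {steps_per_epoch}")
```

The lowest validation loss falls at epoch 8, the row whose figures match the headline results, while Rouge1 and Rouge2 are highest at epoch 5.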

Framework versions

  • Transformers 4.18.0
  • Pytorch 1.11.0+cu113
  • Datasets 2.1.0
  • Tokenizers 0.12.1