distilbart-cnn-arxiv-pubmed-pubmed-earlystopping

This model is a fine-tuned version of theojolliffe/distilbart-cnn-arxiv-pubmed-pubmed; the fine-tuning dataset is not specified. On the evaluation set, the final logged checkpoint (step 1875, epoch 4.71) achieves the following results:

  • Loss: 0.8596
  • Rouge1: 53.4491
  • Rouge2: 35.0041
  • Rougel: 37.2742
  • Rougelsum: 50.9867
  • Gen Len: 142.0

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20 (configured maximum; the training log ends at epoch 4.71)
  • mixed_precision_training: Native AMP
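
As a rough illustration of what `lr_scheduler_type: linear` implies for the learning rate listed above, the sketch below decays the rate linearly from 2e-05 to zero. It assumes no warmup (the card lists none) and estimates the step budget from the training log (about 1875 / 4.71 steps per epoch, times 20 epochs). The names `linear_lr`, `STEPS_PER_EPOCH`, and `TOTAL_STEPS` are illustrative and are not taken from the original training script.

```python
# Sketch of a linear learning-rate schedule with optional warmup.
# Assumption: warmup_steps = 0, since the card does not list a warmup setting.

BASE_LR = 2e-05   # learning_rate from the card
NUM_EPOCHS = 20   # num_epochs from the card

# Estimate only: the log reports step 1875 at epoch 4.71 (epoch is rounded).
STEPS_PER_EPOCH = round(1875 / 4.71)
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def linear_lr(step, total_steps, base_lr=BASE_LR, warmup_steps=0):
    """Return the learning rate at `step` under linear warmup then linear decay."""
    if step < warmup_steps:
        # Ramp up from 0 to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly to 0 at total_steps, never going negative.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 500, 1000, 1500, 1875):
        print(step, linear_lr(step, TOTAL_STEPS))
```

Under these assumptions the rate would still be well above zero at step 1875, because the run covers under a quarter of the configured 20 epochs.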

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len  |
|---------------|-------|------|-----------------|---------|---------|---------|-----------|----------|
| No log        | 0.31  | 125  | 1.3772          | 50.6084 | 30.8075 | 32.6113 | 47.883    | 142.0    |
| No log        | 0.63  | 250  | 1.2423          | 52.1758 | 31.6326 | 32.9448 | 49.8089   | 141.6296 |
| No log        | 0.94  | 375  | 1.1223          | 52.3494 | 32.3508 | 35.3638 | 49.6019   | 142.0    |
| 1.3557        | 1.26  | 500  | 1.1004          | 51.8935 | 32.8506 | 35.521  | 49.6249   | 142.0    |
| 1.3557        | 1.57  | 625  | 1.0600          | 50.8085 | 31.0397 | 34.2021 | 48.2264   | 141.5741 |
| 1.3557        | 1.88  | 750  | 0.9834          | 53.0701 | 34.0699 | 36.4029 | 51.043    | 142.0    |
| 1.3557        | 2.2   | 875  | 0.9554          | 53.4385 | 34.2976 | 36.8142 | 51.1262   | 141.9444 |
| 0.868         | 2.51  | 1000 | 0.9256          | 52.2123 | 32.7568 | 34.5883 | 49.8566   | 142.0    |
| 0.868         | 2.83  | 1125 | 0.8944          | 53.8062 | 34.6687 | 36.9645 | 51.5162   | 142.0    |
| 0.868         | 3.14  | 1250 | 0.9290          | 53.1356 | 34.1301 | 37.7713 | 50.762    | 141.9074 |
| 0.868         | 3.45  | 1375 | 0.9017          | 53.4455 | 35.0572 | 37.3033 | 50.9773   | 142.0    |
| 0.6252        | 3.77  | 1500 | 0.8519          | 53.9228 | 35.5575 | 38.9119 | 51.5202   | 142.0    |
| 0.6252        | 4.08  | 1625 | 0.8991          | 54.4223 | 36.3072 | 38.5771 | 51.9874   | 141.9074 |
| 0.6252        | 4.4   | 1750 | 0.8857          | 53.4105 | 35.348  | 37.5814 | 50.8842   | 142.0    |
| 0.6252        | 4.71  | 1875 | 0.8596          | 53.4491 | 35.0041 | 37.2742 | 50.9867   | 142.0    |
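
To compare checkpoints, the logged rows can be ranked by any metric. The sketch below is a minimal, standard-library example using values copied from the log above. `best_row` is a hypothetical helper, and nothing here claims to reproduce the checkpoint-selection rule used during training.

```python
# Rank logged evaluation rows by a chosen metric.
# Values are copied from the training results log; column order is:
# (step, validation_loss, rouge1, rouge2, rougel, rougelsum)
COLUMNS = ("step", "validation_loss", "rouge1", "rouge2", "rougel", "rougelsum")

ROWS = [
    (125, 1.3772, 50.6084, 30.8075, 32.6113, 47.883),
    (250, 1.2423, 52.1758, 31.6326, 32.9448, 49.8089),
    (375, 1.1223, 52.3494, 32.3508, 35.3638, 49.6019),
    (500, 1.1004, 51.8935, 32.8506, 35.521, 49.6249),
    (625, 1.0600, 50.8085, 31.0397, 34.2021, 48.2264),
    (750, 0.9834, 53.0701, 34.0699, 36.4029, 51.043),
    (875, 0.9554, 53.4385, 34.2976, 36.8142, 51.1262),
    (1000, 0.9256, 52.2123, 32.7568, 34.5883, 49.8566),
    (1125, 0.8944, 53.8062, 34.6687, 36.9645, 51.5162),
    (1250, 0.9290, 53.1356, 34.1301, 37.7713, 50.762),
    (1375, 0.9017, 53.4455, 35.0572, 37.3033, 50.9773),
    (1500, 0.8519, 53.9228, 35.5575, 38.9119, 51.5202),
    (1625, 0.8991, 54.4223, 36.3072, 38.5771, 51.9874),
    (1750, 0.8857, 53.4105, 35.348, 37.5814, 50.8842),
    (1875, 0.8596, 53.4491, 35.0041, 37.2742, 50.9867),
]


def best_row(rows, metric, lower_is_better=False):
    """Return the row (as a dict) with the best value for `metric`."""
    idx = COLUMNS.index(metric)
    pick = min if lower_is_better else max
    return dict(zip(COLUMNS, pick(rows, key=lambda r: r[idx])))


if __name__ == "__main__":
    print("lowest validation loss:", best_row(ROWS, "validation_loss", lower_is_better=True))
    print("highest rouge1:", best_row(ROWS, "rouge1"))
    print("final row:", dict(zip(COLUMNS, ROWS[-1])))
```

On these rows the lowest validation loss is at step 1500 and the highest Rouge1 at step 1625, while the headline results correspond to the last row (step 1875).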

Framework versions

  • Transformers 4.18.0
  • Pytorch 1.11.0+cu113
  • Datasets 2.2.1
  • Tokenizers 0.12.1