distilbart-cnn-arxiv-pubmed-v3-e16

This model is a fine-tuned version of theojolliffe/distilbart-cnn-arxiv-pubmed on an unknown dataset. It achieves the following results on the evaluation set after the final epoch (16):

  • Loss: 0.8502
  • Rouge1: 57.1726
  • Rouge2: 42.87
  • Rougel: 44.7485
  • Rougelsum: 55.6955
  • Gen Len: 141.5926

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 16
  • mixed_precision_training: Native AMP
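
The list above maps roughly onto keyword arguments of `Seq2SeqTrainingArguments` in Transformers. The following is a minimal sketch, not the author's training script: `output_dir`, `predict_with_generate`, the per-device reading of the batch sizes, and the `build_training_args` helper are assumptions that the card does not state.

```python
# Hyperparameters from this card, expressed as keyword arguments for
# transformers.Seq2SeqTrainingArguments. Entries marked "assumption" are
# not stated in the card.
TRAINING_KWARGS = {
    "output_dir": "distilbart-cnn-arxiv-pubmed-v3-e16",  # assumption: any local path
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 1,  # card: train_batch_size: 1
    "per_device_eval_batch_size": 1,   # card: eval_batch_size: 1
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 16,
    "fp16": True,                      # card: Native AMP; requires a CUDA GPU
    "predict_with_generate": True,     # assumption: generate text so ROUGE can be scored
}


def build_training_args():
    """Hypothetical helper: build the arguments object (needs transformers installed)."""
    from transformers import Seq2SeqTrainingArguments  # imported lazily on purpose

    return Seq2SeqTrainingArguments(**TRAINING_KWARGS)
```

Keeping the values in a plain dictionary lets them be inspected or logged without importing Transformers; the arguments object is only built when `build_training_args()` is called.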

Training results

Training Loss  Epoch  Step   Validation Loss  Rouge1   Rouge2   Rougel   Rougelsum  Gen Len
1.4961         1.0    795    1.0907           53.2509  33.4232  34.4499  50.987     142.0
0.8874         2.0    1590   0.9408           52.9708  34.499   36.537   50.3924    140.4074
0.6994         3.0    2385   0.8731           53.4488  34.2476  37.4579  51.1979    142.0
0.4883         4.0    3180   0.8521           53.5463  34.7519  37.8143  51.106     142.0
0.3923         5.0    3975   0.8227           53.3556  35.0361  37.1719  50.9195    141.2222
0.2727         6.0    4770   0.8323           54.8422  37.333   39.6388  52.2975    141.8148
0.2158         7.0    5565   0.8252           54.0343  36.0109  38.34    51.6282    142.0
0.1734         8.0    6360   0.7985           54.9597  38.283   41.0033  52.9537    142.0
0.1366         9.0    7155   0.8112           56.315   40.3948  42.2944  54.3719    142.0
0.1275         10.0   7950   0.8238           55.8688  39.4747  43.0286  53.9269    142.0
0.0978         11.0   8745   0.8345           54.9934  40.0148  42.2721  53.324     142.0
0.0738         12.0   9540   0.8322           56.3862  41.4322  44.1406  54.4768    142.0
0.0688         13.0   10335  0.8384           55.9261  40.7102  43.5825  54.2394    142.0
0.0587         14.0   11130  0.8435           56.8475  41.7188  44.0671  54.9813    142.0
0.0529         15.0   11925  0.8476           57.4678  42.3804  45.4776  55.746     142.0
0.0469         16.0   12720  0.8502           57.1726  42.87    44.7485  55.6955    141.5926
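
A few quantities can be read off the results table with simple arithmetic. The sketch below copies four columns into a hypothetical `ROWS` list, then derives the number of optimizer steps per epoch and the epochs with the lowest validation loss and the highest Rouge1. Reading the steps per epoch as the number of training examples assumes a single device and no gradient accumulation, neither of which the card states.

```python
# (epoch, step, validation_loss, rouge1), copied from the results table above.
ROWS = [
    (1, 795, 1.0907, 53.2509),
    (2, 1590, 0.9408, 52.9708),
    (3, 2385, 0.8731, 53.4488),
    (4, 3180, 0.8521, 53.5463),
    (5, 3975, 0.8227, 53.3556),
    (6, 4770, 0.8323, 54.8422),
    (7, 5565, 0.8252, 54.0343),
    (8, 6360, 0.7985, 54.9597),
    (9, 7155, 0.8112, 56.315),
    (10, 7950, 0.8238, 55.8688),
    (11, 8745, 0.8345, 54.9934),
    (12, 9540, 0.8322, 56.3862),
    (13, 10335, 0.8384, 55.9261),
    (14, 11130, 0.8435, 56.8475),
    (15, 11925, 0.8476, 57.4678),
    (16, 12720, 0.8502, 57.1726),
]

# Steps logged at the end of epoch 1 give the optimizer steps per epoch.
steps_per_epoch = ROWS[0][1]

# Sanity check: the step counter advances by the same amount every epoch.
steps_are_regular = all(step == epoch * steps_per_epoch for epoch, step, _, _ in ROWS)

# Epoch with the lowest validation loss and epoch with the highest Rouge1.
best_loss_epoch, _, best_loss, _ = min(ROWS, key=lambda row: row[2])
best_rouge1_epoch, _, _, best_rouge1 = max(ROWS, key=lambda row: row[3])

print(f"steps per epoch: {steps_per_epoch} (regular: {steps_are_regular})")
print(f"lowest validation loss: {best_loss} at epoch {best_loss_epoch}")
print(f"highest Rouge1: {best_rouge1} at epoch {best_rouge1_epoch}")
```

On these numbers, validation loss is lowest at epoch 8 (0.7985) and Rouge1 peaks at epoch 15 (57.4678), so the reported final checkpoint (epoch 16) is not the best row on either metric. Training loss keeps falling throughout.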

Framework versions

  • Transformers 4.18.0
  • Pytorch 1.11.0+cu113
  • Datasets 2.1.0
  • Tokenizers 0.12.1