distilbart-cnn-arxiv-pubmed-pubmed-v3-e8

This model is a fine-tuned version of theojolliffe/distilbart-cnn-arxiv-pubmed-pubmed. The fine-tuning dataset is not specified. After the final training epoch (epoch 8), it achieves the following results on the evaluation set:

  • Loss: 0.8422
  • Rouge1: 54.9328
  • Rouge2: 36.7154
  • Rougel: 39.5674
  • Rougelsum: 52.4889
  • Gen Len: 142.0

Model description

More information needed

Intended uses & limitations

More information needed
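
Until this section is filled in, the following is a minimal sketch of how a checkpoint like this is typically loaded for summarization with the Transformers `pipeline` API. The repository id in `MODEL_ID` is an assumption pieced together from the base model's namespace and this card's title, and the `summarize` helper is hypothetical. The default `max_length=142` is also an assumption, chosen to match the Gen Len reported on the evaluation set.

```python
# Assumed repository id (base model namespace + this card's title); adjust if it differs.
MODEL_ID = "theojolliffe/distilbart-cnn-arxiv-pubmed-pubmed-v3-e8"


def summarize(text: str, max_length: int = 142) -> str:
    """Hypothetical helper: summarize `text` with the fine-tuned checkpoint.

    max_length=142 mirrors the evaluation Gen Len; it is an assumption,
    not a documented setting of this model.
    """
    # Imported lazily so that defining the helper does not require transformers.
    from transformers import pipeline

    summarizer = pipeline("summarization", model=MODEL_ID)
    # truncation=True clips inputs that exceed the model's maximum input length.
    result = summarizer(text, max_length=max_length, truncation=True)
    return result[0]["summary_text"]
```

Calling `summarize(long_text)` downloads the weights on first use. For repeated calls, building the pipeline once and reusing it would avoid reloading the model each time.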

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 8
  • mixed_precision_training: Native AMP
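
To make the `linear` scheduler setting concrete, here is a small sketch of how the learning rate would decay over the run. It assumes zero warmup steps (the card lists none) and takes the total of 3184 steps and 398 steps per epoch from the training results. The `linear_lr` function is an illustrative reimplementation, not the Trainer's own code.

```python
BASE_LR = 2e-05          # learning_rate from the hyperparameter list
STEPS_PER_EPOCH = 398    # from the training results (step 398 at epoch 1.0)
NUM_EPOCHS = 8
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 3184, the final logged step


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Linear warmup followed by linear decay to zero.

    warmup_steps=0 is an assumption; the card does not report a warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the end of each epoch under these assumptions.
for epoch in range(1, NUM_EPOCHS + 1):
    step = epoch * STEPS_PER_EPOCH
    print(f"epoch {epoch}: step {step}, lr {linear_lr(step):.3e}")
```

Under these assumptions the rate is halved by the end of epoch 4 and reaches zero at the last step.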

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len  |
|---------------|-------|------|-----------------|---------|---------|---------|-----------|----------|
| No log        | 1.0   | 398  | 1.1158          | 50.9754 | 30.9416 | 33.9908 | 48.4925   | 142.0    |
| 1.3585        | 2.0   | 796  | 0.9733          | 52.7954 | 33.8196 | 36.7836 | 50.4929   | 141.9259 |
| 0.8785        | 3.0   | 1194 | 0.9142          | 53.5548 | 35.3954 | 37.4787 | 51.1024   | 142.0    |
| 0.6485        | 4.0   | 1592 | 0.8666          | 52.6449 | 34.0018 | 37.5391 | 50.428    | 141.4074 |
| 0.6485        | 5.0   | 1990 | 0.8458          | 53.8913 | 35.4481 | 38.1552 | 51.3737   | 141.8889 |
| 0.4993        | 6.0   | 2388 | 0.8571          | 54.7333 | 36.8173 | 40.228  | 52.5574   | 141.9444 |
| 0.3957        | 7.0   | 2786 | 0.8455          | 54.9826 | 37.9674 | 40.5786 | 52.5968   | 141.9815 |
| 0.328         | 8.0   | 3184 | 0.8422          | 54.9328 | 36.7154 | 39.5674 | 52.4889   | 142.0    |
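
The per-epoch results can be compared programmatically. The sketch below copies the validation numbers from the training results and picks the best epoch under two criteria. It is only an illustration of how checkpoint selection might be done, and the card does not state which criterion, if any, was used.

```python
# (epoch, validation_loss, rouge1, rouge2, rougeL, rougeLsum), copied from the results above.
RESULTS = [
    (1, 1.1158, 50.9754, 30.9416, 33.9908, 48.4925),
    (2, 0.9733, 52.7954, 33.8196, 36.7836, 50.4929),
    (3, 0.9142, 53.5548, 35.3954, 37.4787, 51.1024),
    (4, 0.8666, 52.6449, 34.0018, 37.5391, 50.4280),
    (5, 0.8458, 53.8913, 35.4481, 38.1552, 51.3737),
    (6, 0.8571, 54.7333, 36.8173, 40.2280, 52.5574),
    (7, 0.8455, 54.9826, 37.9674, 40.5786, 52.5968),
    (8, 0.8422, 54.9328, 36.7154, 39.5674, 52.4889),
]

# Lowest validation loss versus highest score on each ROUGE metric.
best_by_loss = min(RESULTS, key=lambda row: row[1])
best_by_rouge1 = max(RESULTS, key=lambda row: row[2])
best_by_rouge2 = max(RESULTS, key=lambda row: row[3])
best_by_rougel = max(RESULTS, key=lambda row: row[4])
best_by_rougelsum = max(RESULTS, key=lambda row: row[5])

print("lowest validation loss: epoch", best_by_loss[0])
print("highest Rouge1: epoch", best_by_rouge1[0])
print("highest Rouge2: epoch", best_by_rouge2[0])
```

The final epoch has the lowest validation loss, while epoch 7 scores highest on all four ROUGE metrics, so the choice of checkpoint depends on which measure matters more for the intended use.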

Framework versions

  • Transformers 4.18.0
  • Pytorch 1.11.0+cu113
  • Datasets 2.2.0
  • Tokenizers 0.12.1