pegasus-newsroom-cnn_full-adafactor-bs6

This model is a fine-tuned version of oMateos2020/pegasus-newsroom-cnn_full-adafactor-bs6 on the cnn_dailymail dataset. It achieves the following results on the evaluation set:

  • Loss: 2.8671
  • Rouge1: 44.1026
  • Rouge2: 21.4261
  • Rougel: 31.2033
  • Rougelsum: 41.0324
  • Gen Len: 72.0839

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 6.4e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 64
  • total_train_batch_size: 256
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 1
  • mixed_precision_training: Native AMP
  • label_smoothing_factor: 0.1
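
The relationship between the batch-size entries and the shape of the linear schedule can be sketched in a few lines of plain Python. This is an illustration, not the training script: the function names are hypothetical, a single device is assumed (consistent with 4 × 64 = 256), and `total_steps=1120` is an assumption taken from the last step listed in the training results, since the card does not state the exact total number of optimizer steps.

```python
def effective_batch_size(per_device: int, accumulation_steps: int, num_devices: int = 1) -> int:
    """Number of examples contributing to one optimizer update."""
    return per_device * accumulation_steps * num_devices


def linear_schedule_lr(
    step: int,
    peak_lr: float = 6.4e-05,   # learning_rate from the card
    warmup_steps: int = 500,    # lr_scheduler_warmup_steps from the card
    total_steps: int = 1120,    # ASSUMPTION: last step shown in the results table
) -> float:
    """Linear warmup from 0 to peak_lr, then linear decay back to 0."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # train_batch_size=4 with gradient_accumulation_steps=64 on one device
    print("effective batch size:", effective_batch_size(4, 64))
    # Learning rate at the start, mid-warmup, peak, and the two evaluation steps
    for step in (0, 250, 500, 560, 1120):
        print(step, linear_schedule_lr(step))
```

Under these assumptions, warmup occupies 500 of the roughly 1120 optimizer steps, so the learning rate only reaches its peak shortly before the first evaluation at step 560 and then decays for the rest of the epoch.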

Training results

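| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|---------------|-------|------|-----------------|---------|---------|---------|-----------|---------|
| 2.9343        | 0.5   | 560  | 2.8733          | 44.1226 | 21.4087 | 31.2431 | 41.0683   | 69.367  |
| 2.9855        | 1.0   | 1120 | 2.8671          | 44.1026 | 21.4261 | 31.2033 | 41.0324   | 72.0839 |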

Framework versions

  • Transformers 4.20.1
  • PyTorch 1.12.0+cu113
  • Datasets 2.4.0
  • Tokenizers 0.12.1