metadata
tags:
  - generated_from_trainer
datasets:
  - cnn_dailymail
metrics:
  - rouge
model-index:
  - name: pegasus-newsroom-cnn_full-adafactor-bs6
    results:
      - task:
          name: Sequence-to-sequence Language Modeling
          type: text2text-generation
        dataset:
          name: cnn_dailymail
          type: cnn_dailymail
          args: 3.0.0
        metrics:
          - name: Rouge1
            type: rouge
            value: 44.1026

pegasus-newsroom-cnn_full-adafactor-bs6

This model is a fine-tuned version of oMateos2020/pegasus-newsroom-cnn_full-adafactor-bs6 on the cnn_dailymail dataset (version 3.0.0). After one epoch of training, it achieves the following results on the evaluation set:

  • Loss: 2.8671
  • Rouge1: 44.1026
  • Rouge2: 21.4261
  • Rougel: 31.2033
  • Rougelsum: 41.0324
  • Gen Len: 72.0839
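The card does not include usage instructions, so the following is only a minimal sketch of how a checkpoint like this is typically loaded for summarization with the Transformers library. The Hub id in `MODEL_ID` is assumed from the model name and author shown in this card, the helper name `summarize` is hypothetical, and the generation settings (`max_new_tokens=128`, `num_beams=4`) are illustrative values, not settings reported by the author.

```python
# Assumed Hub id, built from the author and model name in this card.
MODEL_ID = "oMateos2020/pegasus-newsroom-cnn_full-adafactor-bs6"


def summarize(article, max_new_tokens=128, num_beams=4):
    """Return a generated summary of `article`.

    Hypothetical helper: downloads the checkpoint on first use and requires
    the `transformers`, `torch`, and `sentencepiece` packages.
    """
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Truncate long articles to the tokenizer's maximum input length.
    inputs = tokenizer(article, truncation=True, return_tensors="pt")

    # Beam search with illustrative settings (not taken from the card).
    output_ids = model.generate(
        **inputs, max_new_tokens=max_new_tokens, num_beams=num_beams
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `summarize(article_text)` on a news article should return a single summary string. The reported average generation length is about 72 tokens, so a limit of 128 new tokens leaves some headroom, but the right value depends on the use case.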

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 6.4e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 64
  • total_train_batch_size: 256
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 1
  • mixed_precision_training: Native AMP
  • label_smoothing_factor: 0.1
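The total_train_batch_size in the list above follows from the per-device batch size and the gradient accumulation steps. The arithmetic can be checked as below, assuming a single training device (the card does not state the device count, so `num_devices` is an assumption). The step count of 1120 at epoch 1.0 is taken from the training results in this card.

```python
# Values copied from the hyperparameter list in this card.
train_batch_size = 4               # examples per device per forward/backward pass
gradient_accumulation_steps = 64   # passes accumulated before each optimizer step
num_devices = 1                    # assumption: not stated in the card

# Effective number of examples contributing to each optimizer step.
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices
print(total_train_batch_size)  # 256

# Optimizer steps reported at epoch 1.0 in the training results.
optimizer_steps = 1120
examples_seen = optimizer_steps * total_train_batch_size
print(examples_seen)  # 286720
```

Under these assumptions, one epoch corresponds to roughly 286,720 training examples processed.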

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|---|---|---|---|---|---|---|---|---|
| 2.9343 | 0.5 | 560 | 2.8733 | 44.1226 | 21.4087 | 31.2431 | 41.0683 | 69.367 |
| 2.9855 | 1.0 | 1120 | 2.8671 | 44.1026 | 21.4261 | 31.2033 | 41.0324 | 72.0839 |
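To relate the two evaluation steps above to the listed learning-rate settings (learning_rate 6.4e-05, linear scheduler, 500 warmup steps), here is a small sketch of a linear warmup followed by linear decay to zero. It assumes the schedule has this common form and that training ends at step 1120, as the results suggest. The function name `linear_schedule_lr` is hypothetical, and the values are illustrative rather than logged learning rates.

```python
def linear_schedule_lr(step, base_lr=6.4e-05, warmup_steps=500, total_steps=1120):
    """Learning rate at `step` for linear warmup followed by linear decay to zero.

    Defaults come from this card; the schedule shape itself is an assumption.
    """
    if step < warmup_steps:
        # Warmup: scale linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay: scale linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the start, mid-warmup, peak, and the two evaluation steps.
for step in (0, 250, 500, 560, 1120):
    print(step, linear_schedule_lr(step))
```

Under these assumptions, the first evaluation at step 560 falls shortly after the peak (about 5.78e-05), and the final evaluation at step 1120 coincides with the learning rate reaching zero.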

Framework versions

  • Transformers 4.20.1
  • PyTorch 1.12.0+cu113
  • Datasets 2.4.0
  • Tokenizers 0.12.1