metadata
license: apache-2.0
tags:
  - generated_from_trainer
datasets:
  - multi_news
metrics:
  - rouge
model-index:
  - name: >-
      bert-small2bert-small-finetuned-cnn_daily_mail-summarization-finetuned-multi_news
    results:
      - task:
          name: Sequence-to-sequence Language Modeling
          type: text2text-generation
        dataset:
          name: multi_news
          type: multi_news
          args: default
        metrics:
          - name: Rouge1
            type: rouge
            value: 38.5318

bert-small2bert-small-finetuned-cnn_daily_mail-summarization-finetuned-multi_news

This model is a fine-tuned version of mrm8488/bert-small2bert-small-finetuned-cnn_daily_mail-summarization on the multi_news dataset. It achieves the following results on the evaluation set:

  • Loss: 4.3760
  • Rouge1: 38.5318
  • Rouge2: 12.7285
  • Rougel: 21.4358
  • Rougelsum: 33.4565
  • Gen Len: 128.985
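For readers unfamiliar with the metric, the following is a simplified sketch of what a ROUGE-1 F1 score measures: unigram overlap between a reference summary and a generated one. It assumes plain whitespace tokenization and lowercasing, so it is not the exact scorer used to produce the numbers above, and the function name `rouge1_f1` is hypothetical. The scores in this card appear to be reported on a 0-100 scale, so the sketch multiplies by 100 when printing.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Unigram-overlap F1 between a reference and a candidate summary.

    Simplified illustration: whitespace tokenization, lowercasing, no stemming.
    """
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped overlap: a token counts at most as often as it occurs in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


score = rouge1_f1("the cat sat on the mat", "the cat lay on a mat")
print(round(100 * score, 2))  # 66.67
```

In this toy pair, four of the six tokens on each side match, so precision and recall are both 4/6.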

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 5
  • mixed_precision_training: Native AMP
  • label_smoothing_factor: 0.1
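As an illustration of how the linear scheduler with 500 warmup steps shapes the learning rate, here is a minimal pure-Python sketch. The function name `linear_lr` is hypothetical, and `total_steps=2250` is an assumption: the card does not state the total step count, but the results table reports step 400 at epoch 0.89, which suggests roughly 450 steps per epoch over 5 epochs.

```python
def linear_lr(
    step: int,
    base_lr: float = 2e-5,      # learning_rate from the list above
    warmup_steps: int = 500,    # lr_scheduler_warmup_steps from the list above
    total_steps: int = 2250,    # ASSUMPTION: ~450 steps/epoch * 5 epochs
) -> float:
    """Learning rate at a given optimizer step under linear warmup and linear decay."""
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay: fall linearly from base_lr to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for s in (0, 250, 500, 1375, 2250):
    print(s, linear_lr(s))
```

Under these assumptions the rate peaks at 2e-05 at step 500 and is back to half that value at step 1375, midway through the decay phase.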

Training results

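| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|---------------|-------|------|-----------------|---------|---------|---------|-----------|---------|
| 4.6946        | 0.89  | 400  | 4.5393          | 37.164  | 11.5191 | 20.2519 | 32.1568   | 126.415 |
| 4.5128        | 1.78  | 800  | 4.4185          | 38.2345 | 12.2053 | 20.954  | 33.0667   | 128.975 |
| 4.2926        | 2.67  | 1200 | 4.3866          | 38.4475 | 12.6488 | 21.3046 | 33.2768   | 129.0   |
| 4.231         | 3.56  | 1600 | 4.3808          | 38.7008 | 12.6323 | 21.307  | 33.3693   | 128.955 |
| 4.125         | 4.44  | 2000 | 4.3760          | 38.5318 | 12.7285 | 21.4358 | 33.4565   | 128.985 |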
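The rows above can be compared programmatically. The short sketch below copies the step, validation loss, and Rouge1 values from the table and finds which evaluation step is best under each criterion; the `EvalRow` type is a hypothetical helper introduced only for this illustration.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    step: int
    val_loss: float
    rouge1: float


# Values copied from the training results table.
rows = [
    EvalRow(400, 4.5393, 37.164),
    EvalRow(800, 4.4185, 38.2345),
    EvalRow(1200, 4.3866, 38.4475),
    EvalRow(1600, 4.3808, 38.7008),
    EvalRow(2000, 4.3760, 38.5318),
]

best_loss = min(rows, key=lambda r: r.val_loss)
best_rouge1 = max(rows, key=lambda r: r.rouge1)

print("lowest validation loss at step", best_loss.step)  # 2000
print("highest Rouge1 at step", best_rouge1.step)        # 1600
```

The two criteria disagree slightly: validation loss is lowest at step 2000, the checkpoint whose metrics are reported at the top of this card, while Rouge1 peaks at step 1600.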

Framework versions

  • Transformers 4.20.1
  • Pytorch 1.11.0
  • Datasets 2.1.0
  • Tokenizers 0.12.1