metadata
license: mit
tags:
  - generated_from_trainer
datasets:
  - multi_news
metrics:
  - rouge
model-index:
  - name: bart-large-cnn-finetuned-multi-news
    results:
      - task:
          name: Sequence-to-sequence Language Modeling
          type: text2text-generation
        dataset:
          name: multi_news
          type: multi_news
          args: default
        metrics:
          - name: Rouge1
            type: rouge
            value: 42.0423

bart-large-cnn-finetuned-multi-news

This model is a fine-tuned version of facebook/bart-large-cnn on the multi_news dataset. It achieves the following results on the evaluation set:

  • Loss: 2.0950
  • Rouge1: 42.0423
  • Rouge2: 14.8812
  • Rougel: 23.3412
  • Rougelsum: 36.2613
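For readers unfamiliar with the metric, the following is a minimal sketch of what a ROUGE-1 F1 score measures: clipped unigram overlap between a reference summary and a generated one. It is a simplified illustration (whitespace tokenization, no stemming), not the implementation used to produce the numbers above, and the function name `rouge1_f1` is hypothetical. The scores in this card appear to be scaled to the 0-100 range, so the example multiplies by 100 when printing.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Unigram-overlap F1 between a reference and a candidate summary.

    Simplified illustration: lowercases and splits on whitespace only.
    """
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped overlap: a unigram counts at most as often as it occurs in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


score = rouge1_f1("the cat sat on the mat", "the cat lay on the mat")
print(round(100 * score, 2))  # 83.33
```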

Model description

facebook/bart-large-cnn fine-tuned on a sample of the multi_news dataset.

Intended uses & limitations

The model is intended for downstream summarization tasks. Its input is limited to 1024 tokens (not words); any text longer than that is truncated.
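A minimal usage sketch with the Transformers library, showing explicit truncation to the 1024-token input limit. The Hub id in `MODEL_ID` is an assumption (it is not stated in this card), and the generation settings (`num_beams=4`, a 256-token summary cap) are illustrative choices rather than the values used for evaluation.

```python
# Assumed Hub id; replace it with the actual repository name if it differs.
MODEL_ID = "nikhedward/bart-large-cnn-finetuned-multi-news"
MAX_INPUT_TOKENS = 1024  # input limit of the underlying BART model


def summarize(text: str, max_summary_tokens: int = 256) -> str:
    """Summarize `text`, truncating the input to MAX_INPUT_TOKENS tokens."""
    # Imported lazily so the module can be loaded without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Anything beyond the token limit is dropped here, before generation.
    inputs = tokenizer(
        text,
        truncation=True,
        max_length=MAX_INPUT_TOKENS,
        return_tensors="pt",
    )
    # Illustrative generation settings, not taken from the card.
    summary_ids = model.generate(
        **inputs, num_beams=4, max_length=max_summary_tokens
    )
    return tokenizer.decode(summary_ids[0], skip_special_tokens=True)
```

Calling `summarize(article_text)` downloads the model on first use. For multi-document inputs longer than the limit, only the leading 1024 tokens influence the summary.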

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 1
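The relationship between these values can be sketched as follows. The effective batch size is the per-device batch size times the gradient accumulation steps; a single device is assumed here, which is consistent with the reported total of 16 but not stated in the card. The linear schedule is shown with zero warmup, also an assumption since no warmup setting is listed, and with the 750 optimizer steps reported in the training results. The helper names are hypothetical.

```python
def effective_batch_size(per_device: int, accumulation_steps: int,
                         num_devices: int = 1) -> int:
    """Examples contributing to each optimizer update."""
    return per_device * accumulation_steps * num_devices


def linear_lr(step: int, total_steps: int, base_lr: float,
              warmup_steps: int = 0) -> float:
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Values from the hyperparameter list; num_devices=1 is an assumption.
batch = effective_batch_size(per_device=4, accumulation_steps=4)
print(batch)  # 16

# Learning rate at the start, midpoint, and end of the 750-step run.
for step in (0, 375, 750):
    print(step, linear_lr(step, total_steps=750, base_lr=2e-05))
```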

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
|---|---|---|---|---|---|---|---|
| 2.2037 | 1.0 | 750 | 2.0950 | 42.0423 | 14.8812 | 23.3412 | 36.2613 |
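Because the card does not state the size of the training sample, a rough estimate can be derived from the figures above: 750 optimizer steps in one epoch at a total train batch size of 16. This is only an approximation, since a final partial batch or dropped remainder would shift the count slightly.

```python
optimizer_steps = 750        # "Step" column in the results table
total_train_batch_size = 16  # from the hyperparameter list
num_epochs = 1

# Approximate number of training examples seen per epoch.
approx_train_examples = optimizer_steps * total_train_batch_size // num_epochs
print(approx_train_examples)  # 12000
```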

Framework versions

  • Transformers 4.18.0
  • Pytorch 1.10.0+cu111
  • Datasets 2.0.0
  • Tokenizers 0.11.6