metadata
license: mit
tags:
  - generated_from_trainer
datasets:
  - multi_news
metrics:
  - rouge
base_model: facebook/bart-large-cnn
model-index:
  - name: bart-large-cnn-finetuned-multi-news
    results:
      - task:
          type: text2text-generation
          name: Sequence-to-sequence Language Modeling
        dataset:
          name: multi_news
          type: multi_news
          args: default
        metrics:
          - type: rouge
            value: 42.0423
            name: Rouge1

bart-large-cnn-finetuned-multi-news

This model is a fine-tuned version of facebook/bart-large-cnn on the multi_news dataset. It achieves the following results on the evaluation set:

  • Loss: 2.0950
  • Rouge1: 42.0423
  • Rouge2: 14.8812
  • Rougel: 23.3412
  • Rougelsum: 36.2613
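
For readers unfamiliar with the metric, the following is a simplified sketch of what a ROUGE-1 F-score measures: clipped unigram overlap between a reference summary and a generated one. It is not the scorer used to produce the numbers above; it assumes whitespace tokenization and no stemming, and the function name `rouge1_f1` is hypothetical. The reported values appear to be on a 0-100 scale, so the sketch multiplies by 100 when printing.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Simplified ROUGE-1 F1: unigram overlap, whitespace tokens, no stemming."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped overlap: a unigram counts only as often as it occurs in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# Toy sentences, not taken from multi_news.
score = rouge1_f1("the cat sat on the mat", "the cat lay on the mat")
print(round(100 * score, 2))
```

Rouge2 applies the same idea to bigrams, while Rougel and Rougelsum are based on the longest common subsequence.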

Model description

facebook/bart-large-cnn fine-tuned on a sample of the multi_news dataset.

Intended uses & limitations

The model is intended for downstream summarization tasks. Input is limited to 1024 tokens (not words); any text longer than that is truncated.
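
A minimal usage sketch follows, showing how the input limit can be applied explicitly. The 1024 limit is counted in tokenizer tokens. The card does not give this checkpoint's full Hub identifier, so `model_id` defaults to the base model `facebook/bart-large-cnn` as a placeholder; substitute the fine-tuned checkpoint's path. The function name `summarize` and the generation settings (`num_beams=4`, a 256-token summary cap) are assumptions, not values from the training run.

```python
MAX_INPUT_TOKENS = 1024  # input limit, counted in tokenizer tokens


def summarize(text, model_id="facebook/bart-large-cnn", max_summary_tokens=256):
    """Summarize `text`, truncating the input to MAX_INPUT_TOKENS tokens.

    `model_id` is a placeholder: pass the fine-tuned checkpoint's Hub path.
    """
    # Imported lazily so this file can be loaded without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Anything beyond the first 1024 tokens is dropped here.
    inputs = tokenizer(
        text,
        truncation=True,
        max_length=MAX_INPUT_TOKENS,
        return_tensors="pt",
    )
    # Beam search settings are illustrative assumptions.
    summary_ids = model.generate(
        **inputs, num_beams=4, max_length=max_summary_tokens
    )
    return tokenizer.decode(summary_ids[0], skip_special_tokens=True)
```

Because multi-document inputs often exceed the limit, it may be worth checking the tokenized length before calling the model so that silently dropped source text does not go unnoticed.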

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 1
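
The arithmetic behind these settings can be sketched as follows. The sketch assumes a single device (consistent with 4 × 4 = 16) and zero warmup steps, since the card lists no warmup value; `linear_lr` is a hypothetical helper that mirrors the usual linear-decay rule and is not code from the training run. The 750 optimizer steps come from the training results reported in this card.

```python
per_device_batch = 4   # train_batch_size
grad_accum_steps = 4   # gradient_accumulation_steps
learning_rate = 2e-5
total_steps = 750      # optimizer steps reported in the training results

# Gradients from 4 mini-batches of 4 are accumulated before each update.
effective_batch = per_device_batch * grad_accum_steps


def linear_lr(step, base_lr=learning_rate, total=total_steps, warmup=0):
    """Linear schedule: optional warmup, then decay to 0 at `total` steps."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


# Approximate number of training examples seen in the single epoch.
examples_seen = effective_batch * total_steps

print(effective_batch, examples_seen)
print(linear_lr(0), linear_lr(375), linear_lr(750))
```

Under these assumptions the single epoch covers roughly 12,000 examples, which fits the description of training on a sample of the dataset.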

Training results

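| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum |
|---------------|-------|------|-----------------|---------|---------|---------|-----------|
| 2.2037        | 1.0   | 750  | 2.0950          | 42.0423 | 14.8812 | 23.3412 | 36.2613   |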

Framework versions

  • Transformers 4.18.0
  • Pytorch 1.10.0+cu111
  • Datasets 2.0.0
  • Tokenizers 0.11.6