BARTxiv / README.md
Justin Du
metadata
language: en
license: mit
library_name: transformers
tags:
  - summarization
  - bart
datasets: ccdv/arxiv-summarization
model-index:
  - name: BARTxiv
    results:
      - task:
          type: summarization
        dataset:
          name: arxiv-summarization
          type: ccdv/arxiv-summarization
          split: validation
        metrics:
          - type: rouge1
            value: 41.70204016592095
          - type: rouge2
            value: 15.134827404979639

BARTxiv

This model is a fine-tuned version of facebook/bart-large-cnn on the ccdv/arxiv-summarization dataset. After 9 epochs of training, it achieves the following results on the validation set:

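  • Loss: 0.86 (the training results table reports 0.869 as the final training loss and 3.5401 as the final validation loss)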
  • Rouge1: 41.70
  • Rouge2: 15.13
  • Rougel: 22.85
  • Rougelsum: 37.77
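
Because the metadata lists `transformers` as the library and summarization as the task, inference would typically go through the summarization pipeline. The following is a minimal sketch, not the author's documented usage. `MODEL_ID` is a hypothetical placeholder (this card does not state the Hub repository ID), and the `max_length` and `min_length` defaults are illustrative assumptions rather than values from this card.

```python
# Hypothetical placeholder: replace with the actual Hub repository ID,
# which this card does not state.
MODEL_ID = "<namespace>/BARTxiv"


def build_summarizer(model_id=MODEL_ID):
    """Create a summarization pipeline for the given checkpoint."""
    # Imported lazily so the helper below can be used and tested
    # without transformers installed.
    from transformers import pipeline

    return pipeline("summarization", model=model_id)


def summarize(text, summarizer, max_length=256, min_length=64):
    """Summarize one document and return the summary string.

    The length limits are illustrative assumptions, not values from this card.
    truncation=True cuts the input to the model's maximum input length,
    which matters because arXiv articles are usually longer than that.
    """
    result = summarizer(
        text,
        max_length=max_length,
        min_length=min_length,
        truncation=True,
    )
    return result[0]["summary_text"]


# Example (downloads the checkpoint, so it is left commented out):
# summarizer = build_summarizer()
# print(summarize(open("paper.txt").read(), summarizer))
```

Passing the pipeline in as an argument keeps the model load separate from the per-document call, so one loaded model can be reused across many articles.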

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-6
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • optimizer: Adafactor
  • num_epochs: 9
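
As a rough illustration of how the values above might map onto `Seq2SeqTrainingArguments` in Transformers, consider the sketch below. The mapping is an assumption, not the author's training script: it treats the batch sizes as per-device values, and `output_dir`, `evaluation_strategy`, and `predict_with_generate` are illustrative additions that this card does not list.

```python
# Hyperparameters exactly as listed in this card.
HPARAMS = {
    "learning_rate": 1e-6,
    "train_batch_size": 1,
    "eval_batch_size": 1,
    "seed": 42,
    "optimizer": "Adafactor",
    "num_epochs": 9,
}


def to_training_kwargs(hp, output_dir="bartxiv-finetune"):
    """Translate the card's hyperparameters into TrainingArguments keywords.

    output_dir is a hypothetical path chosen for this sketch.
    """
    return {
        "output_dir": output_dir,
        "learning_rate": hp["learning_rate"],
        # Assumption: the card's batch sizes are per-device values.
        "per_device_train_batch_size": hp["train_batch_size"],
        "per_device_eval_batch_size": hp["eval_batch_size"],
        "seed": hp["seed"],
        # "adafactor" is the Trainer's optimizer name for Adafactor.
        "optim": hp["optimizer"].lower(),
        "num_train_epochs": hp["num_epochs"],
        # Assumption: evaluate once per epoch, as the results table suggests.
        # The keyword is evaluation_strategy in Transformers 4.25.1;
        # newer releases rename it to eval_strategy.
        "evaluation_strategy": "epoch",
        # Assumption: generate summaries during evaluation so ROUGE can be computed.
        "predict_with_generate": True,
    }


def build_training_args(hp=HPARAMS):
    """Instantiate Seq2SeqTrainingArguments (requires transformers and torch)."""
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(**to_training_kwargs(hp))
```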

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
|---|---|---|---|---|---|---|---|
| 1.24 | 1.0 | 1073 | 7.8495 | 38.32 | 12.80 | 20.55 | 34.50 |
| 1.04 | 2.0 | 2146 | 5.8341 | 39.65 | 13.74 | 21.28 | 35.83 |
| 0.979 | 3.0 | 3219 | 4.0960 | 40.19 | 14.30 | 21.87 | 36.38 |
| 0.970 | 4.0 | 4292 | 3.5401 | 40.87 | 14.44 | 22.14 | 36.89 |
| 0.918 | 5.0 | 5365 | 3.5401 | 41.17 | 14.94 | 22.54 | 37.40 |
| 0.901 | 6.0 | 6438 | 3.5401 | 41.02 | 14.65 | 22.46 | 37.05 |
| 0.889 | 7.0 | 7511 | 3.5401 | 41.32 | 15.09 | 22.64 | 37.42 |
| 0.900 | 8.0 | 8584 | 3.5401 | 41.23 | 15.02 | 22.67 | 37.28 |
| 0.869 | 9.0 | 9657 | 3.5401 | 41.70 | 15.13 | 22.85 | 37.77 |
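
The training results above can be checked with a few lines of arithmetic. The sketch below copies the epoch, step, and ROUGE columns from the table, derives the number of optimizer steps per epoch, and finds the best epoch for each ROUGE metric. The variable and function names are chosen for this example only.

```python
# (epoch, step, Rouge1, Rouge2, Rougel, Rougelsum), copied from the table above.
ROWS = [
    (1, 1073, 38.32, 12.80, 20.55, 34.50),
    (2, 2146, 39.65, 13.74, 21.28, 35.83),
    (3, 3219, 40.19, 14.30, 21.87, 36.38),
    (4, 4292, 40.87, 14.44, 22.14, 36.89),
    (5, 5365, 41.17, 14.94, 22.54, 37.40),
    (6, 6438, 41.02, 14.65, 22.46, 37.05),
    (7, 7511, 41.32, 15.09, 22.64, 37.42),
    (8, 8584, 41.23, 15.02, 22.67, 37.28),
    (9, 9657, 41.70, 15.13, 22.85, 37.77),
]
METRICS = ["Rouge1", "Rouge2", "Rougel", "Rougelsum"]


def steps_per_epoch(rows):
    """Return the set of distinct step/epoch ratios across all rows."""
    return {step // epoch for epoch, step, *_ in rows}


def best_epochs(rows):
    """Map each ROUGE metric to the epoch with its highest score."""
    best = {}
    for i, name in enumerate(METRICS):
        # Scores start at column index 2 in each row.
        best[name] = max(rows, key=lambda row: row[2 + i])[0]
    return best


print(steps_per_epoch(ROWS))  # {1073}
print(best_epochs(ROWS))      # every metric peaks at epoch 9
```

Every row is an exact multiple of 1073 steps, and all four ROUGE metrics peak at the final epoch, which is the checkpoint whose scores are reported at the top of this card.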

Framework versions

  • Transformers 4.25.1
  • PyTorch 1.13.0+cu117
  • Datasets 2.6.1
  • Tokenizers 0.13.1