metadata

language: en
license: mit
library_name: transformers
tags:
  - summarization
  - bart
datasets: ccdv/arxiv-summarization
model-index:
  - name: BARTxiv
    results:
      - task:
          type: summarization
        dataset:
          name: arxiv-summarization
          type: ccdv/arxiv-summarization
          split: validation
        metrics:
          - type: rouge1
            value: 41.70204016592095
          - type: rouge2
            value: 15.134827404979639

BARTxiv

See the model implementation here.

This model is a fine-tuned version of facebook/bart-large-cnn on the arxiv-summarization dataset. It achieves the following results on the validation set:

Loss: 0.86
Rouge1: 41.70
Rouge2: 15.13
RougeL: 22.85
RougeLsum: 37.77
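
To make the ROUGE numbers above easier to interpret, here is a simplified sketch of how a ROUGE-N F1 score can be computed from n-gram overlap. It assumes the reported scores are F-measures scaled by 100, which is common but not confirmed by this card. The function names `ngrams` and `rouge_n` are illustrative, and the tokenization (lowercasing and whitespace splitting, no stemming) is a simplification, so this will not reproduce the exact values reported here.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(reference, candidate, n=1):
    """Simplified ROUGE-N F1 between two strings (lowercased, whitespace-split)."""
    ref = ngrams(reference.lower().split(), n)
    cand = ngrams(candidate.lower().split(), n)
    overlap = sum((ref & cand).values())  # clipped count of shared n-grams
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Toy example (made-up sentences, not taken from the dataset).
reference = "the model summarizes long scientific papers"
candidate = "the model summarizes scientific articles"
print(f"ROUGE-1 F1: {100 * rouge_n(reference, candidate, 1):.2f}")
print(f"ROUGE-2 F1: {100 * rouge_n(reference, candidate, 2):.2f}")
```

ROUGE-2 is usually much lower than ROUGE-1 because matching consecutive word pairs is a stricter requirement than matching single words, which is consistent with the gap between the two scores above.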

Model description

More information needed

Intended uses & limitations

More information needed
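
As a minimal sketch of how a checkpoint like this is typically loaded for inference with the `transformers` summarization pipeline: the repository ID `your-namespace/BARTxiv` is a placeholder (the actual Hub ID is not stated in this card), and `summarize` and its `max_length` default are illustrative choices rather than settings taken from the author's setup.

```python
def summarize(text, model_id="your-namespace/BARTxiv", max_length=256):
    """Summarize `text` with a BART checkpoint.

    `model_id` is a hypothetical placeholder; replace it with the real
    Hub repository ID or a local path to the fine-tuned weights.
    """
    # Imported lazily so the function can be defined without loading the library.
    from transformers import pipeline

    summarizer = pipeline("summarization", model=model_id)
    # truncation=True cuts inputs that exceed the model's maximum input
    # length (1024 tokens for bart-large); full papers are usually longer.
    result = summarizer(text, max_length=max_length, truncation=True)
    return result[0]["summary_text"]
```

Calling `summarize(paper_text)` downloads the weights on first use. Because of the truncation, only the beginning of a long paper reaches the model.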

Training and evaluation data

The model was fine-tuned on the ccdv/arxiv-summarization dataset and evaluated on its validation split.

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-6
train_batch_size: 1
eval_batch_size: 1
seed: 42
optimizer: Adafactor
num_epochs: 9
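
The hyperparameters above could be expressed as keyword arguments for `Seq2SeqTrainingArguments` roughly as follows. This is a sketch under assumptions, not the author's training script: mapping `train_batch_size` and `eval_batch_size` to the per-device arguments is an assumption, and `TRAINING_KWARGS`, `build_training_args`, and the `output_dir` value are illustrative names.

```python
# Values copied from the hyperparameter list; the keyword names follow
# transformers' Seq2SeqTrainingArguments (an assumed mapping).
TRAINING_KWARGS = {
    "learning_rate": 1e-6,
    "per_device_train_batch_size": 1,  # listed as train_batch_size
    "per_device_eval_batch_size": 1,   # listed as eval_batch_size
    "seed": 42,
    "optim": "adafactor",              # listed as optimizer: Adafactor
    "num_train_epochs": 9,             # listed as num_epochs
}


def build_training_args(output_dir="bartxiv-finetune"):
    """Build training arguments; `output_dir` is a hypothetical path."""
    # Imported lazily so the config dict can be inspected without transformers.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)
```

Keeping the values in a plain dictionary makes it easy to log them alongside results or to compare runs before any model is loaded.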

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	RougeL	RougeLsum
1.24	1.0	1073	1.24	38.32	12.80	20.55	34.50
1.04	2.0	2146	1.04	39.65	13.74	21.28	35.83
0.979	3.0	3219	0.98	40.19	14.30	21.87	36.38
0.970	4.0	4292	0.97	40.87	14.44	22.14	36.89
0.918	5.0	5365	0.92	41.17	14.94	22.54	37.40
0.901	6.0	6438	0.90	41.02	14.65	22.46	37.05
0.889	7.0	7511	0.89	41.32	15.09	22.64	37.42
0.900	8.0	8584	0.90	41.23	15.02	22.67	37.28
0.869	9.0	9657	0.87	41.70	15.13	22.85	37.77

Framework versions

Transformers 4.25.1
PyTorch 1.13.0+cu117
Datasets 2.6.1
Tokenizers 0.13.1