bart-base-cnn-xsum-swe

This model is a fine-tuned version of Gabriel/bart-base-cnn-swe. The auto-generated card records the training dataset as None, so the dataset is not named here. The model achieves the following results on the evaluation set:

  • Loss: 2.1027
  • Rouge1: 30.9467
  • Rouge2: 12.2589
  • RougeL: 25.4487
  • RougeLsum: 25.4792
  • Gen Len: 19.7379
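
For readers unfamiliar with the metric names, the following is a minimal sketch of the idea behind ROUGE-1: an F1 score over unigrams shared between a reference summary and a generated one. It is an illustration only. The function name `rouge1_f1` and the whitespace tokenization are assumptions, and the scores reported above were presumably produced by a standard ROUGE implementation with its own tokenization, reported on a 0 to 100 scale.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Unigram-overlap F1 between a reference and a candidate summary.

    Hypothetical helper for illustration: lowercases and splits on
    whitespace, which is simpler than what real ROUGE packages do.
    """
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped overlap: each word counts at most as often as it
    # appears in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Made-up Swedish example sentences, not taken from any dataset.
    score = rouge1_f1("katten sitter på mattan", "katten ligger på mattan")
    print(f"ROUGE-1 F1 (toy example): {100 * score:.2f}")
```

ROUGE-2 applies the same idea to bigrams, while RougeL and RougeLsum are based on the longest common subsequence instead of fixed-size n-grams.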

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 4e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 4
  • mixed_precision_training: Native AMP
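
As a rough illustration of how these settings interact, the sketch below derives the total train batch size from the per-step batch size and gradient accumulation, and evaluates a linear learning-rate schedule with warmup. It assumes a single device and a schedule that ramps linearly from 0 to the peak rate over the warmup steps and then decays linearly to 0 at the final step (25500, taken from the training results). The helper names are hypothetical and this is not the training script itself.

```python
BASE_LR = 4e-05          # learning_rate
WARMUP_STEPS = 500       # lr_scheduler_warmup_steps
TOTAL_STEPS = 25500      # assumption: last step listed in the training results
TRAIN_BATCH_SIZE = 16    # train_batch_size
GRAD_ACCUM_STEPS = 2     # gradient_accumulation_steps


def effective_batch_size(batch_size=TRAIN_BATCH_SIZE,
                         accum_steps=GRAD_ACCUM_STEPS,
                         num_devices=1):
    """Examples consumed per optimizer update (num_devices=1 is an assumption)."""
    return batch_size * accum_steps * num_devices


def linear_lr(step, base_lr=BASE_LR, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Learning rate at an optimizer step under linear warmup, then linear decay."""
    if step < warmup:
        return base_lr * step / warmup
    return base_lr * max(0.0, (total - step) / (total - warmup))


if __name__ == "__main__":
    print("total_train_batch_size:", effective_batch_size())
    for step in (0, 250, 500, 13000, 25500):
        print(f"step {step:>5}: lr = {linear_lr(step):.3e}")
```

With the listed values, 16 examples per step times 2 accumulation steps gives the reported total train batch size of 32.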

Training results

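| Training Loss | Epoch | Step  | Validation Loss | Rouge1  | Rouge2  | RougeL  | RougeLsum | Gen Len |
|---------------|-------|-------|-----------------|---------|---------|---------|-----------|---------|
| 2.3076        | 1.0   | 6375  | 2.1986          | 29.7041 | 10.9883 | 24.2149 | 24.2406   | 19.7193 |
| 2.0733        | 2.0   | 12750 | 2.1246          | 30.4521 | 11.8107 | 24.9519 | 24.9745   | 19.7195 |
| 1.8933        | 3.0   | 19125 | 2.0989          | 30.9407 | 12.2682 | 25.4135 | 25.4378   | 19.7195 |
| 1.777         | 4.0   | 25500 | 2.1027          | 30.9467 | 12.2589 | 25.4487 | 25.4792   | 19.7379 |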
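
The per-epoch results can be read in more than one way, so here is a small sketch that compares checkpoints by validation loss and by Rouge1, and estimates how many examples one epoch covers. The numbers are copied from the results above. The single-device assumption and the variable names are illustrative, and the examples-per-epoch figure is an estimate because the last batch of an epoch may be partial.

```python
# (training_loss, epoch, step, validation_loss, rouge1, rouge2)
ROWS = [
    (2.3076, 1.0, 6375, 2.1986, 29.7041, 10.9883),
    (2.0733, 2.0, 12750, 2.1246, 30.4521, 11.8107),
    (1.8933, 3.0, 19125, 2.0989, 30.9407, 12.2682),
    (1.777, 4.0, 25500, 2.1027, 30.9467, 12.2589),
]

TOTAL_TRAIN_BATCH_SIZE = 32  # from the hyperparameter list

# Epoch with the lowest validation loss.
best_by_loss = min(ROWS, key=lambda row: row[3])
# Epoch with the highest Rouge1 score.
best_by_rouge1 = max(ROWS, key=lambda row: row[4])

# Optimizer steps in the first epoch, and a rough count of examples seen
# per epoch (assumes one device and full batches).
steps_per_epoch = ROWS[0][2]
approx_examples_per_epoch = steps_per_epoch * TOTAL_TRAIN_BATCH_SIZE

if __name__ == "__main__":
    print("lowest validation loss at epoch", best_by_loss[1])
    print("highest Rouge1 at epoch", best_by_rouge1[1])
    print("approx. examples per epoch:", approx_examples_per_epoch)
```

The headline results at the top of the card match the final epoch, which has the highest Rouge1, while the lowest validation loss (2.0989) occurs one epoch earlier.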

Framework versions

  • Transformers 4.22.2
  • PyTorch 1.12.1+cu113
  • Datasets 2.5.1
  • Tokenizers 0.12.1