metadata
license: mit
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: bart-base-cnn-xsum-swe
    results: []

bart-base-cnn-xsum-swe

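This model is a fine-tuned version of Gabriel/bart-base-cnn-swe. The training dataset is not recorded in this card (the generated template lists it as None). After the final epoch (epoch 8), it achieves the following results on the evaluation set: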

  • Loss: 2.1895
  • Rouge1: 31.1693
  • Rouge2: 12.7388
  • RougeL: 25.7655
  • RougeLsum: 25.7862
  • Gen Len: 19.7733

Model description

More information needed

Intended uses & limitations

More information needed
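
Until this section is filled in, the following is a minimal usage sketch rather than documented behavior. It assumes the model is published on the Hugging Face Hub under the id `Gabriel/bart-base-cnn-xsum-swe` (inferred from the base model's namespace and this card's model name, not stated in the card) and that it performs Swedish summarization, as the name suggests. The helper name `summarize` and the default `max_length=20` are illustrative; the latter is chosen only because the reported average generated length is about 19.8 tokens.

```python
MODEL_ID = "Gabriel/bart-base-cnn-xsum-swe"  # assumed Hub id, not confirmed by this card


def summarize(text: str, max_length: int = 20) -> str:
    """Summarize text with the fine-tuned model (downloads weights on first call)."""
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import pipeline

    summarizer = pipeline("summarization", model=MODEL_ID)
    # truncation=True clips inputs that exceed the model's maximum input length.
    result = summarizer(text, max_length=max_length, truncation=True)
    return result[0]["summary_text"]
```

Calling `summarize(article_text)` builds the pipeline on every call; for repeated use it would be cheaper to create the pipeline once and reuse it.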

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 8
  • mixed_precision_training: Native AMP
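
Two of the values above are easier to check as arithmetic: total_train_batch_size is derived from the batch size and the accumulation steps, and the linear schedule is a simple function of the step. The sketch below, in plain Python, recomputes both. It assumes a single training device (the card does not state the device count) and takes the total of 51000 optimizer steps from the training results; the function name `linear_lr` is illustrative and not part of any library.

```python
# Hyperparameters copied from the list above.
LEARNING_RATE = 5e-05
TRAIN_BATCH_SIZE = 16
GRADIENT_ACCUMULATION_STEPS = 2
WARMUP_STEPS = 500
TOTAL_STEPS = 51000  # final step in the training results (8 epochs x 6375 steps)

# Assumption: one device, so the effective batch is batch size x accumulation steps.
NUM_DEVICES = 1
total_train_batch_size = TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS * NUM_DEVICES


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / WARMUP_STEPS
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * remaining / (TOTAL_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    print("total_train_batch_size:", total_train_batch_size)
    for step in (0, 250, 500, 25750, 51000):
        print(step, linear_lr(step))
```

Under the same single-device assumption, 6375 steps per epoch at 32 examples per step means each epoch covers at most about 204,000 training examples.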

Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge1  | Rouge2  | RougeL  | RougeLsum | Gen Len |
|---------------|-------|-------|-----------------|---------|---------|---------|-----------|---------|
| 2.3079        | 1.0   | 6375  | 2.1998          | 29.7845 | 11.125  | 24.3181 | 24.3562   | 19.7119 |
| 2.064         | 2.0   | 12750 | 2.1245          | 30.4641 | 11.7383 | 25.0254 | 25.0633   | 19.653  |
| 1.8647        | 3.0   | 19125 | 2.1005          | 30.8903 | 12.2265 | 25.3996 | 25.4252   | 19.7457 |
| 1.7098        | 4.0   | 25500 | 2.1073          | 31.1173 | 12.4124 | 25.6553 | 25.6913   | 19.7546 |
| 1.5761        | 5.0   | 31875 | 2.1227          | 30.9586 | 12.4907 | 25.5474 | 25.5745   | 19.7675 |
| 1.4618        | 6.0   | 38250 | 2.1484          | 31.115  | 12.6546 | 25.684  | 25.7151   | 19.7456 |
| 1.3643        | 7.0   | 44625 | 2.1705          | 31.2225 | 12.8069 | 25.7901 | 25.8154   | 19.7842 |
| 1.2944        | 8.0   | 51000 | 2.1895          | 31.1693 | 12.7388 | 25.7655 | 25.7862   | 19.7733 |
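
Validation loss and ROUGE do not peak at the same epoch in these results, which matters when choosing a checkpoint. The short sketch below copies the per-epoch numbers from the training results and picks the best epoch under each criterion; the variable names are illustrative, and nothing here is taken from the original training script.

```python
# Per-epoch evaluation metrics copied from the training results above.
# Columns: epoch, validation loss, ROUGE-1, ROUGE-2, ROUGE-L
RESULTS = [
    (1, 2.1998, 29.7845, 11.1250, 24.3181),
    (2, 2.1245, 30.4641, 11.7383, 25.0254),
    (3, 2.1005, 30.8903, 12.2265, 25.3996),
    (4, 2.1073, 31.1173, 12.4124, 25.6553),
    (5, 2.1227, 30.9586, 12.4907, 25.5474),
    (6, 2.1484, 31.1150, 12.6546, 25.6840),
    (7, 2.1705, 31.2225, 12.8069, 25.7901),
    (8, 2.1895, 31.1693, 12.7388, 25.7655),
]

# Lowest validation loss versus highest ROUGE scores.
best_by_loss = min(RESULTS, key=lambda row: row[1])
best_by_rouge1 = max(RESULTS, key=lambda row: row[2])
best_by_rouge2 = max(RESULTS, key=lambda row: row[3])
best_by_rougel = max(RESULTS, key=lambda row: row[4])

if __name__ == "__main__":
    print("best epoch by validation loss:", best_by_loss[0])
    print("best epoch by ROUGE-1:", best_by_rouge1[0])
    print("best epoch by ROUGE-2:", best_by_rouge2[0])
    print("best epoch by ROUGE-L:", best_by_rougel[0])
```

Validation loss is lowest at epoch 3, while all three ROUGE scores are highest at epoch 7. The headline results at the top of the card match the epoch 8 row, which is slightly below the epoch 7 ROUGE scores.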

Framework versions

  • Transformers 4.22.1
  • PyTorch 1.12.1+cu113
  • Datasets 2.5.1
  • Tokenizers 0.12.1