---
license: mit
base_model: facebook/bart-large-cnn
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: BART1
    results: []
---

# BART1

This model is a fine-tuned version of [facebook/bart-large-cnn](https://huggingface.co/facebook/bart-large-cnn) on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 3.8706
- Rouge1: 57.2472
- Rouge2: 23.1787
- Rougel: 41.8726
- Rougelsum: 53.8183
- Gen Len: 234.4232
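
Since the base model is a summarization checkpoint, the fine-tune can presumably be driven through the `summarization` pipeline. Below is a minimal sketch; the repo id `MarPla/BART1` is an assumption inferred from this card, and the generation lengths are guesses based on the Gen Len figure above:

```python
# A minimal inference sketch. "MarPla/BART1" is an assumed Hub repo id;
# substitute the actual hosted location or a local checkpoint path.
from transformers import pipeline

summarizer = pipeline("summarization", model="MarPla/BART1")

text = "Long input document to summarize goes here ..."
result = summarizer(text, max_length=256, min_length=32, do_sample=False)
print(result[0]["summary_text"])
```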

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a reconstruction sketch in code follows the list):

- learning_rate: 5e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 16
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 1
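
As a hedged reconstruction, these settings map onto `Seq2SeqTrainingArguments` roughly as shown below; the output directory and the generation flag are assumptions not stated in this card, and the Adam betas/epsilon listed above are the library defaults:

```python
# A reconstruction sketch of the training configuration above; output_dir and
# predict_with_generate are assumptions not stated in this card.
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="BART1",               # assumed
    learning_rate=5e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=16,   # effective total train batch size: 16
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=1,
    seed=42,
    # Adam betas=(0.9, 0.999) and epsilon=1e-08 are the library defaults
    # (adam_beta1, adam_beta2, adam_epsilon), so they need not be set here.
    predict_with_generate=True,       # assumed: needed to compute ROUGE in eval
)
```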

### Training results

| Training Loss | Epoch  | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len  |
|:-------------:|:------:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:--------:|
| 5.8303        | 0.0835 | 100  | 5.6762          | 48.0404 | 16.526  | 33.0315 | 45.2714   | 234.4232 |
| 5.2419        | 0.1671 | 200  | 5.1330          | 49.5121 | 17.8978 | 34.5708 | 46.291    | 234.4232 |
| 5.0085        | 0.2506 | 300  | 4.8037          | 52.3507 | 19.2179 | 36.3445 | 48.7473   | 234.4232 |
| 4.676         | 0.3342 | 400  | 4.5745          | 51.4939 | 19.2534 | 37.2441 | 48.7288   | 234.4232 |
| 4.4521        | 0.4177 | 500  | 4.4154          | 52.9389 | 20.2028 | 38.4139 | 49.9981   | 234.4232 |
| 4.4572        | 0.5013 | 600  | 4.2389          | 54.6029 | 21.0796 | 39.2355 | 51.1397   | 234.4232 |
| 4.2836        | 0.5848 | 700  | 4.1267          | 55.5174 | 22.1184 | 40.2744 | 52.0886   | 234.4232 |
| 4.2862        | 0.6684 | 800  | 4.0549          | 56.305  | 22.433  | 40.8636 | 52.6987   | 234.4232 |
| 4.0806        | 0.7519 | 900  | 3.9673          | 57.3033 | 22.873  | 41.2543 | 53.5936   | 234.4232 |
| 4.0806        | 0.8355 | 1000 | 3.9154          | 56.3519 | 22.7588 | 41.4512 | 52.9385   | 234.4232 |
| 3.8885        | 0.9190 | 1100 | 3.8706          | 57.2472 | 23.1787 | 41.8726 | 53.8183   | 234.4232 |
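
For reference, ROUGE figures in this style can be reproduced with the separate `evaluate` package (not listed among the versions below, so its use here is an assumption); scores come back on a 0-1 scale and are conventionally multiplied by 100:

```python
# A minimal sketch for computing ROUGE; predictions/references are stand-ins.
import evaluate

rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=["the model's generated summary"],  # hypothetical output
    references=["the gold reference summary"],      # hypothetical target
)
# Returns rouge1, rouge2, rougeL, rougeLsum as floats in [0, 1].
print({k: round(v * 100, 4) for k, v in scores.items()})
```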

### Framework versions

- Transformers 4.41.2
- PyTorch 2.3.1+cu121
- Datasets 2.19.2
- Tokenizers 0.19.1