bart_bertsum_1024_250_1000

This model is a fine-tuned version of facebook/bart-large on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.0191
  • Rouge1: 0.6894
  • Rouge2: 0.4262
  • Rougel: 0.6274
  • Rougelsum: 0.6272
  • Wer: 0.4606
  • Bleurt: -0.5228

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 6
  • eval_batch_size: 6
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 2
  • mixed_precision_training: Native AMP
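As a rough illustration of what `lr_scheduler_type: linear` means for the learning rate above, the sketch below decays from 2e-05 to zero over the run. It rests on two assumptions not stated in this card: no warmup steps, and a total of about 3760 optimizer steps (the last logged step is 3750 at epoch 1.99). The helper name `linear_lr` is hypothetical, and the code does not reproduce the actual training script.

```python
def linear_lr(step, total_steps, base_lr=2e-05, warmup_steps=0):
    """Learning rate at `step` under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Assumption: roughly 3760 total steps, inferred from the training log, not stated in the card.
TOTAL_STEPS = 3760

for step in (0, 1000, 2000, 3000, 3750):
    print(f"step {step:>4}: lr = {linear_lr(step, TOTAL_STEPS):.3e}")
```

Under these assumptions the learning rate is about half its initial value midway through training and close to zero by the final logged step.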

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Wer    | Bleurt  |
|---------------|-------|------|-----------------|--------|--------|--------|-----------|--------|---------|
| No log        | 0.13  | 250  | 1.2589          | 0.6432 | 0.3644 | 0.5764 | 0.5763    | 0.5168 | -0.3132 |
| 2.1861        | 0.27  | 500  | 1.1641          | 0.6562 | 0.3824 | 0.591  | 0.591     | 0.4985 | -0.867  |
| 2.1861        | 0.4   | 750  | 1.1326          | 0.6626 | 0.3917 | 0.5988 | 0.5987    | 0.4904 | -0.5078 |
| 1.2496        | 0.53  | 1000 | 1.1111          | 0.6657 | 0.3958 | 0.6015 | 0.6014    | 0.4859 | -0.484  |
| 1.2496        | 0.66  | 1250 | 1.0959          | 0.6708 | 0.4014 | 0.6052 | 0.6051    | 0.4814 | -0.4774 |
| 1.193         | 0.8   | 1500 | 1.0774          | 0.6724 | 0.4041 | 0.609  | 0.609     | 0.4787 | -0.494  |
| 1.193         | 0.93  | 1750 | 1.0662          | 0.681  | 0.4127 | 0.6177 | 0.6176    | 0.4742 | -0.4464 |
| 1.14          | 1.06  | 2000 | 1.0593          | 0.6795 | 0.4157 | 0.6178 | 0.6177    | 0.4709 | -0.5849 |
| 1.14          | 1.2   | 2250 | 1.0504          | 0.6784 | 0.4158 | 0.6161 | 0.616     | 0.4685 | -0.3624 |
| 1.0439        | 1.33  | 2500 | 1.0427          | 0.6815 | 0.418  | 0.6196 | 0.6195    | 0.4667 | -0.5998 |
| 1.0439        | 1.46  | 2750 | 1.0357          | 0.6833 | 0.4198 | 0.6209 | 0.6207    | 0.465  | -0.6198 |
| 1.045         | 1.6   | 3000 | 1.0286          | 0.6872 | 0.4238 | 0.6251 | 0.6251    | 0.4635 | -0.4564 |
| 1.045         | 1.73  | 3250 | 1.0248          | 0.6829 | 0.4214 | 0.6222 | 0.6221    | 0.4622 | -0.5228 |
| 1.0242        | 1.86  | 3500 | 1.0198          | 0.69   | 0.4273 | 0.6284 | 0.6283    | 0.4601 | -0.4592 |
| 1.0242        | 1.99  | 3750 | 1.0191          | 0.6894 | 0.4262 | 0.6274 | 0.6272    | 0.4606 | -0.5228 |
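The evaluation results reported at the top of this card match the last logged checkpoint (step 3750). The sketch below copies a few columns from the training log and checks which checkpoint is best under each metric. The tuple layout and the helper name `best_step` are illustrative choices, not part of the original training code.

```python
# (step, validation_loss, rouge1, wer), copied from the training results above.
LOG = [
    (250, 1.2589, 0.6432, 0.5168),
    (500, 1.1641, 0.6562, 0.4985),
    (750, 1.1326, 0.6626, 0.4904),
    (1000, 1.1111, 0.6657, 0.4859),
    (1250, 1.0959, 0.6708, 0.4814),
    (1500, 1.0774, 0.6724, 0.4787),
    (1750, 1.0662, 0.6810, 0.4742),
    (2000, 1.0593, 0.6795, 0.4709),
    (2250, 1.0504, 0.6784, 0.4685),
    (2500, 1.0427, 0.6815, 0.4667),
    (2750, 1.0357, 0.6833, 0.4650),
    (3000, 1.0286, 0.6872, 0.4635),
    (3250, 1.0248, 0.6829, 0.4622),
    (3500, 1.0198, 0.6900, 0.4601),
    (3750, 1.0191, 0.6894, 0.4606),
]


def best_step(rows, column, higher_is_better):
    """Return the step whose value in `column` is best under the given direction."""
    pick = max if higher_is_better else min
    return pick(rows, key=lambda row: row[column])[0]


print("lowest validation loss:", best_step(LOG, 1, higher_is_better=False))
print("highest Rouge1:        ", best_step(LOG, 2, higher_is_better=True))
print("lowest WER:            ", best_step(LOG, 3, higher_is_better=False))
```

On these logged values, validation loss is lowest at step 3750, while Rouge1 and WER are marginally better at step 3500. The differences are small, so either checkpoint would be a defensible choice.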

Framework versions

  • Transformers 4.38.2
  • Pytorch 2.2.1+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2

Model size

  • 406M params
  • Tensor type: F32 (Safetensors)