
bart-large-finetuned-xsum

This model is a fine-tuned version of facebook/bart-large on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.7085
  • Rouge1: 93.7743
  • Rouge2: 90.9799
  • Rougel: 93.7951
  • Rougelsum: 93.7675
  • Gen Len: 10.7959
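
As a rough usage illustration, the checkpoint can be loaded with the transformers summarization pipeline. This is a minimal sketch: the repo id below is a hypothetical placeholder for wherever this model is hosted, and the generation lengths are guesses informed by the short average Gen Len reported above.

```python
# Minimal inference sketch. The repo id is a hypothetical placeholder;
# substitute the actual hub id or local path of this checkpoint.
from transformers import pipeline

summarizer = pipeline(
    "summarization",
    model="your-username/bart-large-finetuned-xsum",  # hypothetical id
)

text = "Replace this with the document you want to summarize."
# Short max_length chosen to match the ~10-token average Gen Len above.
result = summarizer(text, max_length=32, min_length=4, do_sample=False)
print(result[0]["summary_text"])
```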

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 30
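
For context, here is a minimal sketch of how these hyperparameters map onto a Seq2SeqTrainer setup. The dataset variables are placeholders, since the training data is not documented in this card, and the Adam betas/epsilon listed above are the optimizer defaults.

```python
# Sketch of the training setup implied by the hyperparameters above.
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

model_name = "facebook/bart-large"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Placeholders (assumptions): the card does not document the dataset,
# so supply your own tokenized train/validation splits here.
train_dataset = ...
eval_dataset = ...

args = Seq2SeqTrainingArguments(
    output_dir="bart-large-finetuned-xsum",
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,                      # Adam betas/epsilon above are the defaults
    lr_scheduler_type="linear",
    num_train_epochs=30,
    evaluation_strategy="epoch",  # evaluated once per epoch, as in the table below
    predict_with_generate=True,   # needed to compute ROUGE during evaluation
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
    tokenizer=tokenizer,
)
trainer.train()
```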

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| No log        | 1.0   | 50   | 0.6634          | 83.4679 | 75.0163 | 83.4978 | 83.5205   | 9.5204  |
| No log        | 2.0   | 100  | 0.7003          | 87.6834 | 82.1534 | 87.6691 | 87.6372   | 11.2041 |
| No log        | 3.0   | 150  | 0.6851          | 92.341  | 89.3673 | 92.265  | 92.306    | 10.6633 |
| No log        | 4.0   | 200  | 0.5687          | 82.5008 | 75.639  | 82.6478 | 82.485    | 9.1531  |
| No log        | 5.0   | 250  | 1.1993          | 90.2087 | 86.3398 | 90.1494 | 90.073    | 11.4592 |
| No log        | 6.0   | 300  | 0.5020          | 86.2842 | 81.3427 | 86.1805 | 86.0801   | 10.0408 |
| No log        | 7.0   | 350  | 0.5845          | 88.6278 | 83.9881 | 88.4848 | 88.6153   | 9.8878  |
| No log        | 8.0   | 400  | 0.6150          | 91.3071 | 87.7098 | 91.3283 | 91.311    | 10.4796 |
| No log        | 9.0   | 450  | 0.5937          | 90.9829 | 85.4487 | 91.0795 | 91.0271   | 11.2755 |
| 0.2951        | 10.0  | 500  | 0.6871          | 91.0166 | 88.4471 | 90.9538 | 91.0866   | 10.2041 |
| 0.2951        | 11.0  | 550  | 0.6682          | 91.4535 | 87.1402 | 91.422  | 91.3889   | 10.8571 |
| 0.2951        | 12.0  | 600  | 0.6011          | 92.0081 | 87.9292 | 91.9871 | 91.9615   | 11.6531 |
| 0.2951        | 13.0  | 650  | 0.8260          | 92.3687 | 89.0047 | 92.4395 | 92.4088   | 10.6224 |
| 0.2951        | 14.0  | 700  | 0.9396          | 91.7057 | 87.0141 | 91.7057 | 91.628    | 11.2245 |
| 0.2951        | 15.0  | 750  | 0.8138          | 91.1908 | 86.4812 | 91.1969 | 91.2138   | 11.602  |
| 0.2951        | 16.0  | 800  | 0.8685          | 93.3392 | 89.4402 | 93.341  | 93.3289   | 10.8061 |
| 0.2951        | 17.0  | 850  | 0.7764          | 91.5805 | 87.9478 | 91.5089 | 91.4414   | 11.551  |
| 0.2951        | 18.0  | 900  | 0.6408          | 88.2589 | 83.4929 | 88.2428 | 88.1257   | 9.8367  |
| 0.2951        | 19.0  | 950  | 0.6844          | 93.2318 | 90.7216 | 93.3116 | 93.2035   | 10.5306 |
| 0.1066        | 20.0  | 1000 | 0.7665          | 94.0825 | 91.5035 | 94.104  | 94.0729   | 10.8878 |
| 0.1066        | 21.0  | 1050 | 0.6803          | 93.8229 | 90.7038 | 93.886  | 93.7719   | 11.3469 |
| 0.1066        | 22.0  | 1100 | 0.8246          | 93.0925 | 89.8534 | 93.0948 | 93.0231   | 11.7857 |
| 0.1066        | 23.0  | 1150 | 0.7397          | 93.0087 | 89.9417 | 93.0176 | 92.9489   | 11.3878 |
| 0.1066        | 24.0  | 1200 | 0.7468          | 93.2956 | 90.0867 | 93.3264 | 93.2707   | 10.5816 |
| 0.1066        | 25.0  | 1250 | 0.7766          | 92.9672 | 89.7517 | 92.9915 | 92.9125   | 11.5816 |
| 0.1066        | 26.0  | 1300 | 0.7415          | 93.1965 | 89.9231 | 93.2259 | 93.1154   | 11.102  |
| 0.1066        | 27.0  | 1350 | 0.7283          | 93.2911 | 90.0648 | 93.348  | 93.3104   | 10.7245 |
| 0.1066        | 28.0  | 1400 | 0.7374          | 93.6969 | 90.4839 | 93.6888 | 93.6523   | 10.8163 |
| 0.1066        | 29.0  | 1450 | 0.6907          | 93.7121 | 90.8289 | 93.7581 | 93.6831   | 10.8571 |
| 0.0663        | 30.0  | 1500 | 0.7085          | 93.7743 | 90.9799 | 93.7951 | 93.7675   | 10.7959 |
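
The ROUGE scores above follow the usual Hugging Face convention of reporting values scaled to 0-100. A small sketch of computing them with the evaluate library (with illustrative strings, not the actual evaluation set) looks like this:

```python
# ROUGE computation sketch with the Hugging Face `evaluate` library.
# The prediction/reference strings are illustrative placeholders.
import evaluate

rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=["the cat sat on the mat"],
    references=["the cat sat on the mat"],
    use_stemmer=True,
)
# Recent versions of `evaluate` return fractions in [0, 1];
# the table above reports them multiplied by 100.
print({k: round(v * 100, 4) for k, v in scores.items()})
```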

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.1.0+cu118
  • Datasets 2.15.0
  • Tokenizers 0.13.3