# bart-large-finetuned-xsum
This model is a fine-tuned version of facebook/bart-large on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 0.7085
- Rouge1: 93.7743
- Rouge2: 90.9799
- RougeL: 93.7951
- RougeLsum: 93.7675
- Gen Len: 10.7959
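The ROUGE scores above measure n-gram overlap between generated and reference summaries. As an illustration, here is a minimal, simplified sketch of ROUGE-1 F1 (lowercased whitespace tokens, no stemming, unlike the official `rouge_score` implementation):

```python
from collections import Counter

def rouge1_f1(prediction: str, reference: str) -> float:
    """Simplified ROUGE-1 F1: unigram overlap between prediction and reference."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Clipped unigram overlap: each token counts at most as often as it
    # appears in the other text.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

print(round(rouge1_f1("the cat sat on the mat", "the cat lay on the mat"), 4))  # → 0.8333
```

ROUGE-2, RougeL, and RougeLsum follow the same precision/recall/F1 pattern over bigrams and longest common subsequences, respectively.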
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 30
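With `lr_scheduler_type: linear` and no warmup steps listed, the learning rate decays linearly from 2e-05 at the start of training to 0 at the final step. A minimal sketch of that schedule, assuming 1500 total optimization steps (30 epochs at 50 steps each, matching the results table):

```python
def linear_lr(step: int, base_lr: float = 2e-05, total_steps: int = 1500) -> float:
    """Linear decay from base_lr at step 0 down to 0 at total_steps (no warmup assumed)."""
    return base_lr * max(0.0, 1.0 - step / total_steps)

print(linear_lr(0))     # start of training: 2e-05
print(linear_lr(750))   # halfway (epoch 15): 1e-05
print(linear_lr(1500))  # end of training: 0.0
```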
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Gen Len |
|---|---|---|---|---|---|---|---|---|
No log | 1.0 | 50 | 0.6634 | 83.4679 | 75.0163 | 83.4978 | 83.5205 | 9.5204 |
No log | 2.0 | 100 | 0.7003 | 87.6834 | 82.1534 | 87.6691 | 87.6372 | 11.2041 |
No log | 3.0 | 150 | 0.6851 | 92.341 | 89.3673 | 92.265 | 92.306 | 10.6633 |
No log | 4.0 | 200 | 0.5687 | 82.5008 | 75.639 | 82.6478 | 82.485 | 9.1531 |
No log | 5.0 | 250 | 1.1993 | 90.2087 | 86.3398 | 90.1494 | 90.073 | 11.4592 |
No log | 6.0 | 300 | 0.5020 | 86.2842 | 81.3427 | 86.1805 | 86.0801 | 10.0408 |
No log | 7.0 | 350 | 0.5845 | 88.6278 | 83.9881 | 88.4848 | 88.6153 | 9.8878 |
No log | 8.0 | 400 | 0.6150 | 91.3071 | 87.7098 | 91.3283 | 91.311 | 10.4796 |
No log | 9.0 | 450 | 0.5937 | 90.9829 | 85.4487 | 91.0795 | 91.0271 | 11.2755 |
0.2951 | 10.0 | 500 | 0.6871 | 91.0166 | 88.4471 | 90.9538 | 91.0866 | 10.2041 |
0.2951 | 11.0 | 550 | 0.6682 | 91.4535 | 87.1402 | 91.422 | 91.3889 | 10.8571 |
0.2951 | 12.0 | 600 | 0.6011 | 92.0081 | 87.9292 | 91.9871 | 91.9615 | 11.6531 |
0.2951 | 13.0 | 650 | 0.8260 | 92.3687 | 89.0047 | 92.4395 | 92.4088 | 10.6224 |
0.2951 | 14.0 | 700 | 0.9396 | 91.7057 | 87.0141 | 91.7057 | 91.628 | 11.2245 |
0.2951 | 15.0 | 750 | 0.8138 | 91.1908 | 86.4812 | 91.1969 | 91.2138 | 11.602 |
0.2951 | 16.0 | 800 | 0.8685 | 93.3392 | 89.4402 | 93.341 | 93.3289 | 10.8061 |
0.2951 | 17.0 | 850 | 0.7764 | 91.5805 | 87.9478 | 91.5089 | 91.4414 | 11.551 |
0.2951 | 18.0 | 900 | 0.6408 | 88.2589 | 83.4929 | 88.2428 | 88.1257 | 9.8367 |
0.2951 | 19.0 | 950 | 0.6844 | 93.2318 | 90.7216 | 93.3116 | 93.2035 | 10.5306 |
0.1066 | 20.0 | 1000 | 0.7665 | 94.0825 | 91.5035 | 94.104 | 94.0729 | 10.8878 |
0.1066 | 21.0 | 1050 | 0.6803 | 93.8229 | 90.7038 | 93.886 | 93.7719 | 11.3469 |
0.1066 | 22.0 | 1100 | 0.8246 | 93.0925 | 89.8534 | 93.0948 | 93.0231 | 11.7857 |
0.1066 | 23.0 | 1150 | 0.7397 | 93.0087 | 89.9417 | 93.0176 | 92.9489 | 11.3878 |
0.1066 | 24.0 | 1200 | 0.7468 | 93.2956 | 90.0867 | 93.3264 | 93.2707 | 10.5816 |
0.1066 | 25.0 | 1250 | 0.7766 | 92.9672 | 89.7517 | 92.9915 | 92.9125 | 11.5816 |
0.1066 | 26.0 | 1300 | 0.7415 | 93.1965 | 89.9231 | 93.2259 | 93.1154 | 11.102 |
0.1066 | 27.0 | 1350 | 0.7283 | 93.2911 | 90.0648 | 93.348 | 93.3104 | 10.7245 |
0.1066 | 28.0 | 1400 | 0.7374 | 93.6969 | 90.4839 | 93.6888 | 93.6523 | 10.8163 |
0.1066 | 29.0 | 1450 | 0.6907 | 93.7121 | 90.8289 | 93.7581 | 93.6831 | 10.8571 |
0.0663 | 30.0 | 1500 | 0.7085 | 93.7743 | 90.9799 | 93.7951 | 93.7675 | 10.7959 |
### Framework versions
- Transformers 4.30.0
- PyTorch 2.1.0+cu118
- Datasets 2.15.0
- Tokenizers 0.13.3