metadata
license: mit
base_model: facebook/bart-large-cnn
tags:
- generated_from_trainer
metrics:
- rouge
model-index:
- name: finetuned_bart_large_custom
  results: []
finetuned_bart_large_custom
This model is a fine-tuned version of facebook/bart-large-cnn. The training dataset was not recorded. After the final epoch (epoch 30, step 480), it achieves the following results on the evaluation set:
- Loss: 4.8324
- Rouge1: 39.9143
- Rouge2: 10.7144
- RougeL: 21.1537
- RougeLsum: 35.81
- Gen Len: 131.6667
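Because the model is fine-tuned from a summarization checkpoint, it can presumably be loaded through the Transformers summarization pipeline. The following is a minimal sketch, not a documented usage recipe: `MODEL_ID` is a hypothetical placeholder (this card does not state where the checkpoint is published), and the `summarize` helper is introduced here only for illustration.

```python
# Hypothetical location of the fine-tuned checkpoint; replace with the real
# Hub repository id or a local directory containing the saved model.
MODEL_ID = "path/to/finetuned_bart_large_custom"


def summarize(text, model_id=MODEL_ID):
    """Return a summary of `text` using the summarization pipeline."""
    # Imported lazily so this module can be loaded without transformers.
    from transformers import pipeline

    summarizer = pipeline("summarization", model=model_id)
    # truncation=True clips inputs that exceed the model's maximum length.
    result = summarizer(text, truncation=True, do_sample=False)
    return result[0]["summary_text"]
```

Calling `summarize(article_text)` downloads or loads the checkpoint on first use, so for repeated calls it would be better to build the pipeline once and reuse it.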
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 4
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 30
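The listed values can be sketched as keyword arguments for `transformers.Seq2SeqTrainingArguments`, together with the arithmetic behind `total_train_batch_size` and the linear schedule. Several things here are assumptions rather than facts from this card: training on a single device, zero warmup steps (no warmup is listed), the use of `Seq2SeqTrainingArguments` itself, and the `output_dir` value. The total of 480 optimizer steps is taken from the training results table.

```python
# Hyperparameters copied from the list above, keyed by the argument names
# used by transformers.Seq2SeqTrainingArguments.
HPARAMS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 2,
    "per_device_eval_batch_size": 2,
    "seed": 42,
    "gradient_accumulation_steps": 2,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 30,
}

TOTAL_STEPS = 480  # last optimizer step reported in the training results


def effective_batch_size(hparams, num_devices=1):
    """Examples consumed per optimizer step (num_devices=1 is an assumption)."""
    return (
        hparams["per_device_train_batch_size"]
        * hparams["gradient_accumulation_steps"]
        * num_devices
    )


def linear_lr(step, base_lr=HPARAMS["learning_rate"],
              total_steps=TOTAL_STEPS, warmup_steps=0):
    """Linear warmup, then linear decay to zero (warmup_steps=0 is assumed)."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


def build_training_args(output_dir="finetuned_bart_large_custom"):
    """Build the arguments object; output_dir is a hypothetical path."""
    from transformers import Seq2SeqTrainingArguments  # lazy import

    return Seq2SeqTrainingArguments(output_dir=output_dir, **HPARAMS)
```

Under these assumptions, `effective_batch_size(HPARAMS)` reproduces the reported total of 4 (2 per device times 2 accumulation steps), and `linear_lr(240)` gives half the initial learning rate at the midpoint of training.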
Training results
Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Gen Len |
---|---|---|---|---|---|---|---|---|
No log | 1.0 | 16 | 4.3093 | 39.1367 | 9.9819 | 21.0796 | 35.3746 | 132.0741 |
No log | 2.0 | 32 | 4.2921 | 39.0619 | 9.8356 | 21.7437 | 35.6597 | 131.7037 |
No log | 3.0 | 48 | 4.3876 | 39.5314 | 10.337 | 21.0096 | 35.9973 | 131.2593 |
No log | 4.0 | 64 | 4.4020 | 39.3551 | 9.9689 | 21.4343 | 35.3958 | 131.1481 |
No log | 5.0 | 80 | 4.3744 | 39.7603 | 10.4124 | 21.6535 | 35.4996 | 132.963 |
No log | 6.0 | 96 | 4.4821 | 39.9859 | 11.0712 | 22.2449 | 35.7868 | 132.4074 |
No log | 7.0 | 112 | 4.6017 | 38.765 | 10.3317 | 20.9319 | 34.6675 | 132.2593 |
No log | 8.0 | 128 | 4.4419 | 39.9964 | 10.3341 | 20.9618 | 35.8621 | 130.2222 |
No log | 9.0 | 144 | 4.4990 | 39.8075 | 10.3829 | 21.3509 | 35.9882 | 128.7407 |
No log | 10.0 | 160 | 4.7017 | 38.6152 | 9.9282 | 20.4588 | 34.4487 | 131.9259 |
No log | 11.0 | 176 | 4.5497 | 39.0296 | 9.9429 | 20.8087 | 34.4624 | 132.6296 |
No log | 12.0 | 192 | 4.7301 | 38.8819 | 9.5937 | 20.929 | 34.7983 | 131.4444 |
No log | 13.0 | 208 | 4.5114 | 38.4163 | 9.6869 | 20.373 | 34.1491 | 123.8519 |
No log | 14.0 | 224 | 4.7097 | 38.4294 | 9.5615 | 20.1514 | 35.0332 | 131.7407 |
No log | 15.0 | 240 | 4.6300 | 38.9564 | 9.6386 | 20.0618 | 34.8298 | 129.963 |
No log | 16.0 | 256 | 4.6916 | 38.5582 | 10.136 | 20.8347 | 34.4795 | 129.8519 |
No log | 17.0 | 272 | 4.6959 | 38.3264 | 9.5281 | 20.5576 | 34.6148 | 128.2963 |
No log | 18.0 | 288 | 4.6756 | 37.5569 | 9.123 | 19.8291 | 33.5111 | 126.6667 |
No log | 19.0 | 304 | 4.7579 | 38.5704 | 9.3654 | 20.1826 | 34.8297 | 131.4815 |
No log | 20.0 | 320 | 4.8128 | 40.158 | 10.3889 | 20.9267 | 36.8965 | 130.1852 |
No log | 21.0 | 336 | 4.7659 | 39.4144 | 10.2445 | 20.4763 | 35.328 | 134.2593 |
No log | 22.0 | 352 | 4.7983 | 40.2859 | 11.0388 | 21.1643 | 36.0311 | 131.9259 |
No log | 23.0 | 368 | 4.7954 | 39.2676 | 10.5795 | 21.1116 | 35.3949 | 130.1481 |
No log | 24.0 | 384 | 4.7991 | 39.8126 | 10.3955 | 21.2952 | 35.7538 | 130.5926 |
No log | 25.0 | 400 | 4.8371 | 39.3481 | 10.2857 | 20.9862 | 35.1724 | 125.1481 |
No log | 26.0 | 416 | 4.8589 | 40.0988 | 10.4426 | 21.7284 | 35.7289 | 130.3333 |
No log | 27.0 | 432 | 4.8423 | 39.9233 | 10.3253 | 21.5853 | 36.1194 | 131.1111 |
No log | 28.0 | 448 | 4.8274 | 40.0388 | 10.1713 | 20.991 | 35.3966 | 130.4444 |
No log | 29.0 | 464 | 4.8313 | 39.8516 | 10.6207 | 21.0394 | 35.6627 | 130.8148 |
No log | 30.0 | 480 | 4.8324 | 39.9143 | 10.7144 | 21.1537 | 35.81 | 131.6667 |
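One way to read the table is to ask which epoch gives the lowest validation loss and which gives the highest Rouge1. The sketch below transcribes the Epoch, Validation Loss, and Rouge1 columns from the table; the helper names are illustrative only.

```python
# (epoch, validation_loss, rouge1) transcribed from the table above.
ROWS = [
    (1, 4.3093, 39.1367), (2, 4.2921, 39.0619), (3, 4.3876, 39.5314),
    (4, 4.4020, 39.3551), (5, 4.3744, 39.7603), (6, 4.4821, 39.9859),
    (7, 4.6017, 38.765), (8, 4.4419, 39.9964), (9, 4.4990, 39.8075),
    (10, 4.7017, 38.6152), (11, 4.5497, 39.0296), (12, 4.7301, 38.8819),
    (13, 4.5114, 38.4163), (14, 4.7097, 38.4294), (15, 4.6300, 38.9564),
    (16, 4.6916, 38.5582), (17, 4.6959, 38.3264), (18, 4.6756, 37.5569),
    (19, 4.7579, 38.5704), (20, 4.8128, 40.158), (21, 4.7659, 39.4144),
    (22, 4.7983, 40.2859), (23, 4.7954, 39.2676), (24, 4.7991, 39.8126),
    (25, 4.8371, 39.3481), (26, 4.8589, 40.0988), (27, 4.8423, 39.9233),
    (28, 4.8274, 40.0388), (29, 4.8313, 39.8516), (30, 4.8324, 39.9143),
]


def best_epoch_by_loss(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[1])


def best_epoch_by_rouge1(rows):
    """Return the row with the highest Rouge1 score."""
    return max(rows, key=lambda row: row[2])


if __name__ == "__main__":
    print("lowest validation loss:", best_epoch_by_loss(ROWS))
    print("highest Rouge1:", best_epoch_by_rouge1(ROWS))
```

On these numbers the lowest validation loss occurs at epoch 2 (4.2921) and the highest Rouge1 at epoch 22 (40.2859), while the final checkpoint reports a loss of 4.8324. Rouge1 stays within roughly 37.6 to 40.3 throughout, so the rising loss may indicate overfitting, though the card does not give enough information to confirm that.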
Framework versions
- Transformers 4.37.0
- Pytorch 2.1.2
- Datasets 2.1.0
- Tokenizers 0.15.1
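To compare a local environment against the versions listed above, a small standard-library check can help. This is a sketch: the PyPI distribution names (`transformers`, `torch`, `datasets`, `tokenizers`) are assumed to correspond to the frameworks named in the list, and `compare_versions` is a helper introduced here for illustration.

```python
from importlib import metadata

# Versions listed in this card, keyed by assumed PyPI distribution name.
EXPECTED = {
    "transformers": "4.37.0",
    "torch": "2.1.2",
    "datasets": "2.1.0",
    "tokenizers": "0.15.1",
}


def compare_versions(expected=EXPECTED):
    """Map each package to (expected, installed, matches).

    `installed` is None when the package is not installed.
    """
    report = {}
    for package, wanted in expected.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None
        # Ignore local build suffixes such as "+cu121" when comparing.
        matches = installed is not None and installed.split("+")[0] == wanted
        report[package] = (wanted, installed, matches)
    return report


if __name__ == "__main__":
    for name, (wanted, installed, ok) in compare_versions().items():
        status = "ok" if ok else "differs"
        print(f"{name}: expected {wanted}, installed {installed} ({status})")
```

A mismatch does not necessarily mean the model will fail to load; it only flags that the environment differs from the one recorded here.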