metadata
license: mit
tags:
- generated_from_trainer
metrics:
- rouge
model-index:
- name: bart-cnn-pubmed-arxiv-pubmed-arxiv-earlystopping
results: []
bart-cnn-pubmed-arxiv-pubmed-arxiv-earlystopping
This model is a fine-tuned version of theojolliffe/bart-cnn-pubmed-arxiv-pubmed-arxiv on an unknown dataset. The results below are from the final logged evaluation (step 6250, epoch 15.7) on the evaluation set:
- Loss: 0.8793
- Rouge1: 56.2055
- Rouge2: 41.9231
- Rougel: 45.0616
- Rougelsum: 54.6643
- Gen Len: 142.0
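A minimal sketch of loading the model for summarization with the Transformers `pipeline` API. The repository id is an assumption (the base model's namespace combined with this model's name) and is not confirmed by this card. Likewise, `max_length=142` simply mirrors the reported generation length and is not a documented setting.

```python
# Assumed Hub repository id: namespace taken from the base model, name from this card.
MODEL_ID = "theojolliffe/bart-cnn-pubmed-arxiv-pubmed-arxiv-earlystopping"


def summarize(text, model_id=MODEL_ID, max_length=142):
    """Return a summary of `text` produced by the fine-tuned BART model."""
    # Imported lazily so the module can be inspected without transformers installed.
    from transformers import pipeline

    summarizer = pipeline("summarization", model=model_id)
    # truncation=True clips inputs that exceed the model's maximum input length.
    result = summarizer(text, max_length=max_length, truncation=True)
    return result[0]["summary_text"]
```

Calling `summarize(long_text)` downloads the weights on first use, so it needs network access and the `transformers` package.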
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 1000 (configured maximum; the training log ends at epoch 15.7)
- mixed_precision_training: Native AMP
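The hyperparameters above map onto `Seq2SeqTrainingArguments` roughly as follows. This is a sketch under assumptions, not the author's training script: the choice of `Seq2SeqTrainingArguments`, the `output_dir` value, and the reading of `train_batch_size` as a per-device size are guesses. Evaluation, logging, and early-stopping settings are not listed in this card and are left out. `linear_lr` illustrates the linear schedule assuming zero warmup (none is listed) and about 398 optimizer steps per epoch, an estimate taken from the results table (step 6250 at epoch 15.7).

```python
# Keyword arguments mirroring the hyperparameters listed in this card.
TRAINING_KWARGS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 2,  # assumption: card value is per device
    "per_device_eval_batch_size": 2,   # assumption: card value is per device
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 1000,
    "fp16": True,  # "Native AMP"; needs a CUDA-capable GPU
}


def build_training_args(output_dir="bart-earlystopping-out"):
    """Build training arguments; `output_dir` is a hypothetical path."""
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)


def linear_lr(step, total_steps, base_lr=2e-5):
    """Learning rate under linear decay to zero with no warmup (assumed)."""
    return base_lr * max(0.0, 1.0 - step / total_steps)


# Estimated schedule length: 1000 epochs at roughly 398 steps per epoch.
ESTIMATED_TOTAL_STEPS = 1000 * 398
LR_AT_LAST_LOGGED_STEP = linear_lr(6250, ESTIMATED_TOTAL_STEPS)
```

Under these assumptions the learning rate at step 6250 is still about 1.97e-05, so the run would have ended with the rate close to its initial value.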
Training results
Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
---|---|---|---|---|---|---|---|---|
No log | 0.31 | 125 | 1.2057 | 50.9339 | 30.6777 | 32.6396 | 47.9592 | 141.3519 |
No log | 0.63 | 250 | 1.0933 | 52.0728 | 31.2361 | 32.8214 | 48.9776 | 141.9815 |
No log | 0.94 | 375 | 0.9685 | 51.6847 | 32.1578 | 34.1933 | 48.8808 | 141.5556 |
1.1594 | 1.26 | 500 | 0.9725 | 50.5131 | 30.6043 | 32.1861 | 47.4346 | 142.0 |
1.1594 | 1.57 | 625 | 0.9342 | 52.228 | 32.2073 | 33.797 | 49.2395 | 142.0 |
1.1594 | 1.88 | 750 | 0.8715 | 52.2 | 33.6602 | 36.1303 | 49.7138 | 141.6481 |
1.1594 | 2.2 | 875 | 0.8334 | 53.116 | 33.9871 | 35.9641 | 50.7658 | 141.8889 |
0.6845 | 2.51 | 1000 | 0.8241 | 52.2612 | 32.8025 | 35.27 | 49.5694 | 142.0 |
0.6845 | 2.83 | 1125 | 0.7986 | 54.1803 | 35.0019 | 37.4582 | 51.4577 | 142.0 |
0.6845 | 3.14 | 1250 | 0.8532 | 52.1328 | 32.6086 | 34.7455 | 49.6219 | 141.7037 |
0.6845 | 3.45 | 1375 | 0.8319 | 51.9614 | 32.8544 | 35.3269 | 49.3279 | 141.7593 |
0.4488 | 3.77 | 1500 | 0.8033 | 53.1404 | 34.6086 | 37.5482 | 50.7414 | 142.0 |
0.4488 | 4.08 | 1625 | 0.8322 | 53.1736 | 34.8662 | 37.7514 | 51.0601 | 142.0 |
0.4488 | 4.4 | 1750 | 0.7985 | 51.8251 | 32.9457 | 36.4164 | 49.55 | 142.0 |
0.4488 | 4.71 | 1875 | 0.8049 | 54.3423 | 36.6293 | 39.1316 | 52.2706 | 141.8148 |
0.3017 | 5.03 | 2000 | 0.8148 | 53.0698 | 35.2569 | 38.406 | 50.9346 | 141.7778 |
0.3017 | 5.34 | 2125 | 0.8153 | 53.4479 | 35.1525 | 37.8071 | 51.3731 | 141.0741 |
0.3017 | 5.65 | 2250 | 0.8009 | 52.5517 | 34.8287 | 37.999 | 50.2889 | 141.6111 |
0.3017 | 5.97 | 2375 | 0.7509 | 54.2725 | 37.4164 | 40.516 | 52.1379 | 142.0 |
0.2052 | 6.28 | 2500 | 0.8019 | 54.622 | 36.4776 | 39.9306 | 52.5069 | 142.0 |
0.2052 | 6.6 | 2625 | 0.8176 | 55.4796 | 38.4502 | 41.5523 | 53.5211 | 142.0 |
0.2052 | 6.91 | 2750 | 0.7956 | 55.4906 | 37.9064 | 40.845 | 53.107 | 141.9815 |
0.2052 | 7.22 | 2875 | 0.7966 | 54.5177 | 37.3399 | 40.7678 | 52.4241 | 142.0 |
0.1465 | 7.54 | 3000 | 0.8311 | 54.3473 | 37.0659 | 40.2507 | 52.372 | 142.0 |
0.1465 | 7.85 | 3125 | 0.8227 | 53.9245 | 36.4695 | 39.1205 | 51.9416 | 141.8889 |
0.1465 | 8.17 | 3250 | 0.7947 | 54.766 | 38.4275 | 41.2293 | 52.9075 | 142.0 |
0.1465 | 8.48 | 3375 | 0.7954 | 54.5305 | 37.6934 | 40.6804 | 52.5884 | 141.9444 |
0.115 | 8.79 | 3500 | 0.8433 | 54.7962 | 37.9373 | 41.3906 | 52.3778 | 142.0 |
0.115 | 9.11 | 3625 | 0.8416 | 56.59 | 41.2271 | 44.4207 | 54.7199 | 142.0 |
0.115 | 9.42 | 3750 | 0.8164 | 55.1903 | 39.0588 | 41.4908 | 53.4897 | 142.0 |
0.115 | 9.74 | 3875 | 0.8363 | 55.2894 | 39.3598 | 42.1138 | 53.831 | 141.8889 |
0.0912 | 10.05 | 4000 | 0.8850 | 55.7705 | 40.4924 | 43.1048 | 54.254 | 142.0 |
0.0912 | 10.36 | 4125 | 0.8268 | 56.1664 | 40.641 | 42.798 | 54.0001 | 141.9259 |
0.0912 | 10.68 | 4250 | 0.8564 | 55.4701 | 39.4949 | 42.2559 | 53.4486 | 141.8889 |
0.0912 | 10.99 | 4375 | 0.8557 | 56.0849 | 41.2861 | 45.8277 | 54.5999 | 141.6667 |
0.0707 | 11.31 | 4500 | 0.8432 | 54.9496 | 39.3006 | 42.0025 | 53.3854 | 142.0 |
0.0707 | 11.62 | 4625 | 0.8377 | 54.2438 | 37.6959 | 40.4637 | 52.3088 | 142.0 |
0.0707 | 11.93 | 4750 | 0.8794 | 55.9488 | 40.5401 | 43.7347 | 54.1282 | 142.0 |
0.0707 | 12.25 | 4875 | 0.8563 | 57.8762 | 43.366 | 46.6757 | 56.6985 | 142.0 |
0.0604 | 12.56 | 5000 | 0.8835 | 54.8926 | 39.3755 | 42.384 | 53.2687 | 141.6481 |
0.0604 | 12.88 | 5125 | 0.8570 | 55.6656 | 39.849 | 42.1455 | 54.352 | 142.0 |
0.0604 | 13.19 | 5250 | 0.8539 | 57.1549 | 41.901 | 45.153 | 55.213 | 142.0 |
0.0604 | 13.51 | 5375 | 0.8847 | 56.3279 | 40.9269 | 43.416 | 54.7242 | 142.0 |
0.051 | 13.82 | 5500 | 0.8795 | 56.8982 | 42.3333 | 45.2669 | 55.1034 | 142.0 |
0.051 | 14.13 | 5625 | 0.8751 | 55.3173 | 40.2853 | 43.2479 | 53.7236 | 142.0 |
0.051 | 14.45 | 5750 | 0.8799 | 56.1678 | 41.0862 | 43.8581 | 54.6316 | 142.0 |
0.051 | 14.76 | 5875 | 0.8678 | 57.3539 | 43.0473 | 44.8511 | 55.6474 | 142.0 |
0.0467 | 15.08 | 6000 | 0.8945 | 56.1939 | 41.985 | 45.0266 | 54.8139 | 142.0 |
0.0467 | 15.39 | 6125 | 0.9245 | 56.2071 | 41.5265 | 44.3228 | 54.5042 | 141.4074 |
0.0467 | 15.7 | 6250 | 0.8793 | 56.2055 | 41.9231 | 45.0616 | 54.6643 | 142.0 |
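The table can be scanned programmatically to find which evaluation step scored best on a given metric. The sketch below parses the table's pipe-delimited layout and picks the rows with the lowest validation loss and the highest Rouge1. To keep it short it embeds only four rows copied from the table; `parse_table` is an illustrative helper, not part of any library.

```python
# Four rows copied from the table above; paste the full table to analyse every step.
TABLE_EXCERPT = """\
Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
---|---|---|---|---|---|---|---|---|
No log | 0.31 | 125 | 1.2057 | 50.9339 | 30.6777 | 32.6396 | 47.9592 | 141.3519 |
0.3017 | 5.97 | 2375 | 0.7509 | 54.2725 | 37.4164 | 40.516 | 52.1379 | 142.0 |
0.0707 | 12.25 | 4875 | 0.8563 | 57.8762 | 43.366 | 46.6757 | 56.6985 | 142.0 |
0.0467 | 15.7 | 6250 | 0.8793 | 56.2055 | 41.9231 | 45.0616 | 54.6643 | 142.0 |
"""


def parse_table(text):
    """Parse a pipe-delimited table into a list of {column: float or None}."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    header = [cell.strip() for cell in lines[0].strip("|").split("|")]
    rows = []
    for line in lines[2:]:  # skip the header and the ---|--- separator
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        row = {}
        for name, cell in zip(header, cells):
            try:
                row[name] = float(cell)
            except ValueError:
                row[name] = None  # e.g. "No log" before the first logging step
        rows.append(row)
    return rows


rows = parse_table(TABLE_EXCERPT)
best_loss = min(rows, key=lambda r: r["Validation Loss"])
best_rouge1 = max(rows, key=lambda r: r["Rouge1"])

print("lowest validation loss at step", int(best_loss["Step"]))
print("highest Rouge1 at step", int(best_rouge1["Step"]))
```

Across the full table, the lowest validation loss is 0.7509 at step 2375 and the highest Rouge1 is 57.8762 at step 4875.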
Framework versions
- Transformers 4.19.1
- Pytorch 1.11.0+cu113
- Datasets 2.2.1
- Tokenizers 0.12.1
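To check whether a local environment matches the versions listed above, a small comparison with the standard-library `importlib.metadata` module can help. This is a sketch: the distribution names in `EXPECTED` (for example `torch` for Pytorch) are the usual PyPI names and are assumed here, and the local build suffix of the installed torch package may legitimately differ from `+cu113`.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this card, keyed by their assumed PyPI distribution names.
EXPECTED = {
    "transformers": "4.19.1",
    "torch": "1.11.0+cu113",
    "datasets": "2.2.1",
    "tokenizers": "0.12.1",
}


def compare_versions(expected=EXPECTED):
    """Return {package: (wanted, installed or None, exact_match)}."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (wanted, installed, installed == wanted)
    return report


if __name__ == "__main__":
    for package, (wanted, installed, ok) in compare_versions().items():
        status = "ok" if ok else "differs"
        print(f"{package}: wanted {wanted}, found {installed} ({status})")
```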