# kobart_8_5.6e-5_min30_lp4_sample

This model is a fine-tuned version of [gogamza/kobart-base-v2](https://huggingface.co/gogamza/kobart-base-v2) on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 2.8230
- Rouge1: 36.1016
- Rouge2: 12.8106
- RougeL: 23.6405
- Bleu1: 30.2521
- Bleu2: 17.5293
- Bleu3: 10.3861
- Bleu4: 5.7474
- Gen Len: 50.6713
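
The suffixes in the model name (min30, lp4, sample) likely encode the generation settings used at evaluation time: `min_length=30`, `length_penalty=4.0`, and sampling. Below is a minimal inference sketch under that assumption; the hub repository id is a placeholder, since the full hub path is not given in this card.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Placeholder repo id -- substitute the actual hub path of this checkpoint.
model_id = "your-username/kobart_8_5.6e-5_min30_lp4_sample"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "요약할 한국어 문서를 여기에 입력합니다."  # Korean document to summarize
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=1024)

# Generation settings inferred from the model name; length_penalty only
# takes effect with beam search, so num_beams > 1 is assumed here.
summary_ids = model.generate(
    **inputs,
    min_length=30,
    max_length=128,
    length_penalty=4.0,
    do_sample=True,
    num_beams=4,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```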
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5.6e-05
- train_batch_size: 8
- eval_batch_size: 128
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 5.0
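
For reference, here is a minimal sketch of how the hyperparameters above map onto `Seq2SeqTrainingArguments` (Transformers 4.25). The output directory is a placeholder, and the Adam betas/epsilon listed above match the library defaults, so they need no explicit arguments.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch only: reproduces the listed hyperparameters. output_dir is a
# placeholder; Adam betas=(0.9, 0.999) and epsilon=1e-8 are the defaults.
training_args = Seq2SeqTrainingArguments(
    output_dir="kobart_8_5.6e-5_min30_lp4_sample",
    learning_rate=5.6e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=128,
    seed=42,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=5.0,
    predict_with_generate=True,  # assumed, since ROUGE/BLEU are reported
)
```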
### Training results

Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | Bleu1 | Bleu2 | Bleu3 | Bleu4 | Gen Len |
---|---|---|---|---|---|---|---|---|---|---|---|
2.527 | 0.19 | 1000 | 3.0014 | 31.7677 | 9.8681 | 20.6405 | 26.0107 | 13.8904 | 7.2892 | 3.6709 | 48.8228 |
2.4185 | 0.38 | 2000 | 2.8850 | 32.5931 | 10.727 | 21.3403 | 26.4666 | 14.6488 | 8.0738 | 4.2365 | 44.3497 |
2.3546 | 0.57 | 3000 | 2.8285 | 32.7686 | 11.0203 | 21.8204 | 26.9575 | 15.0974 | 8.365 | 4.6586 | 45.0956 |
2.2617 | 0.76 | 4000 | 2.7775 | 34.1375 | 12.1264 | 22.506 | 28.082 | 16.2606 | 9.486 | 5.3768 | 49.4872 |
2.2106 | 0.94 | 5000 | 2.7396 | 33.4733 | 11.2845 | 22.5126 | 27.3856 | 15.3472 | 8.567 | 4.6969 | 44.2401 |
2.0022 | 1.13 | 6000 | 2.7534 | 33.9237 | 11.84 | 22.5473 | 27.5719 | 15.8555 | 9.0337 | 5.1397 | 45.2611 |
1.9749 | 1.32 | 7000 | 2.7258 | 35.1741 | 12.4088 | 22.8272 | 29.4193 | 17.0056 | 9.9196 | 5.5038 | 50.3124 |
1.993 | 1.51 | 8000 | 2.7026 | 35.8572 | 13.2373 | 23.5429 | 30.1024 | 17.7802 | 10.6998 | 6.133 | 51.9953 |
1.9461 | 1.7 | 9000 | 2.6379 | 35.0541 | 12.4639 | 23.4095 | 28.7614 | 16.7411 | 9.5243 | 5.3422 | 45.7319 |
1.9159 | 1.89 | 10000 | 2.6071 | 35.3005 | 13.0834 | 23.5232 | 29.2371 | 17.3405 | 10.0603 | 5.9913 | 46.3846 |
1.6347 | 2.08 | 11000 | 2.6773 | 35.6737 | 12.7968 | 23.5884 | 30.0898 | 17.5699 | 10.0439 | 5.984 | 51.4755 |
1.6179 | 2.27 | 12000 | 2.6652 | 35.6258 | 13.0066 | 24.1646 | 29.4431 | 17.3774 | 10.4055 | 6.0368 | 47.2121 |
1.613 | 2.45 | 13000 | 2.6667 | 35.6093 | 12.3267 | 23.4513 | 29.6818 | 17.0819 | 9.7674 | 5.4192 | 48.1632 |
1.6642 | 2.64 | 14000 | 2.6516 | 36.1341 | 12.9256 | 23.6283 | 30.3579 | 17.689 | 10.3152 | 5.6037 | 47.9534 |
1.6432 | 2.83 | 15000 | 2.6498 | 37.3996 | 14.1165 | 24.4384 | 31.3868 | 18.8878 | 11.6758 | 7.0218 | 51.0769 |
1.371 | 3.02 | 16000 | 2.7315 | 36.2931 | 13.1544 | 23.6259 | 30.5586 | 17.9341 | 10.2612 | 5.6973 | 53.0606 |
1.374 | 3.21 | 17000 | 2.7438 | 36.2938 | 13.3253 | 23.8868 | 30.2665 | 17.9543 | 10.6402 | 6.1801 | 48.0303 |
1.3962 | 3.4 | 18000 | 2.7682 | 35.8607 | 12.9747 | 23.7071 | 30.0202 | 17.573 | 10.256 | 5.8021 | 49.8578 |
1.3699 | 3.59 | 19000 | 2.7530 | 36.1645 | 12.8211 | 23.5026 | 30.2944 | 17.6159 | 10.18 | 5.6959 | 51.3846 |
1.3552 | 3.78 | 20000 | 2.7558 | 36.1135 | 12.6383 | 23.1973 | 30.2234 | 17.3569 | 9.9499 | 5.577 | 50.2098 |
1.37 | 3.97 | 21000 | 2.7441 | 35.9377 | 12.744 | 23.3985 | 30.1982 | 17.5623 | 10.1743 | 5.8601 | 51.704 |
1.1739 | 4.15 | 22000 | 2.8335 | 36.126 | 12.8817 | 23.4948 | 30.259 | 17.6231 | 10.2709 | 5.772 | 51.5035 |
1.1966 | 4.34 | 23000 | 2.8219 | 36.3689 | 12.7938 | 23.7675 | 30.5862 | 17.6182 | 10.3642 | 6.0505 | 49.5664 |
1.1812 | 4.53 | 24000 | 2.8206 | 36.3009 | 13.2677 | 23.65 | 30.5531 | 18.0616 | 10.7975 | 6.3877 | 51.6783 |
1.1885 | 4.72 | 25000 | 2.8247 | 36.0696 | 13.0568 | 23.7406 | 30.4063 | 17.8602 | 10.7829 | 6.0939 | 50.4499 |
1.165 | 4.91 | 26000 | 2.8230 | 36.1016 | 12.8106 | 23.6405 | 30.2521 | 17.5293 | 10.3861 | 5.7474 | 50.6713 |
### Framework versions
- Transformers 4.25.1
- Pytorch 1.13.0+cu117
- Datasets 2.7.1
- Tokenizers 0.13.2