# kobart_8_1e-4_datav2_min30_lp5.0_temperature1.0
This model is a fine-tuned version of gogamza/kobart-base-v2 on an unspecified dataset.
It achieves the following results on the evaluation set:
- Loss: 3.0961
- Rouge1: 35.8883
- Rouge2: 12.7003
- Rougel: 23.3874
- Bleu1: 30.2528
- Bleu2: 17.5183
- Bleu3: 10.2094
- Bleu4: 5.6021
- Gen Len: 50.1562
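The Bleu1–Bleu4 scores above are n-gram precision metrics. The exact evaluation script is not given in this card; as an illustration only, a minimal pure-Python sketch of the modified (clipped) n-gram precision that underlies BLEU — real Korean summarization outputs would first be tokenized, e.g. by the KoBART tokenizer:

```python
from collections import Counter

def ngram_precision(candidate: str, reference: str, n: int) -> float:
    """Modified n-gram precision: candidate n-gram counts are clipped
    by their counts in the reference, as in BLEU."""
    cand = candidate.split()
    ref = reference.split()
    cand_ngrams = Counter(tuple(cand[i:i + n]) for i in range(len(cand) - n + 1))
    ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
    overlap = sum(min(c, ref_ngrams[g]) for g, c in cand_ngrams.items())
    total = sum(cand_ngrams.values())
    return overlap / total if total else 0.0

# Hypothetical toy pair (English for readability):
cand = "the cat sat on the mat"
ref = "the cat is on the mat"
print(ngram_precision(cand, ref, 1))  # 5 of 6 unigrams match -> 0.8333...
print(ngram_precision(cand, ref, 2))  # 3 of 5 bigrams match -> 0.6
```

Note this sketch omits the brevity penalty and corpus-level aggregation that a full BLEU implementation (e.g. in an evaluation library) would apply.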
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 8
- eval_batch_size: 128
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 5.0
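Together, `lr_scheduler_type: linear` and `lr_scheduler_warmup_ratio: 0.1` mean the learning rate ramps linearly from 0 to the peak 1e-4 over the first 10% of training steps, then decays linearly back to 0. A minimal sketch of that schedule (the total step count here is illustrative, not taken from this run):

```python
def linear_schedule_lr(step: int, total_steps: int,
                       peak_lr: float = 1e-4,
                       warmup_ratio: float = 0.1) -> float:
    """Linear warmup to peak_lr over warmup_ratio of training,
    then linear decay to 0 (mirrors the `linear` scheduler)."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    return peak_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

# Illustrative values with a hypothetical 1000-step run:
print(linear_schedule_lr(0, 1000))    # 0.0 (start of warmup)
print(linear_schedule_lr(100, 1000))  # 0.0001 (peak, end of warmup)
print(linear_schedule_lr(550, 1000))  # 5e-05 (halfway through decay)
```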
### Training results
Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Bleu1 | Bleu2 | Bleu3 | Bleu4 | Gen Len |
---|---|---|---|---|---|---|---|---|---|---|---|
2.4648 | 0.19 | 1000 | 2.9491 | 32.241 | 10.5261 | 21.21 | 26.5995 | 14.7371 | 7.8411 | 4.1361 | 48.303 |
2.4028 | 0.38 | 2000 | 2.9226 | 33.8957 | 11.6309 | 22.4654 | 28.1592 | 15.9817 | 9.163 | 5.0564 | 49.5175 |
2.4109 | 0.57 | 3000 | 2.9092 | 33.9997 | 11.4619 | 22.2822 | 28.0021 | 15.7774 | 8.7258 | 4.5887 | 44.6807 |
2.3846 | 0.76 | 4000 | 2.8763 | 31.8881 | 10.1122 | 21.1754 | 25.4518 | 13.7126 | 7.4549 | 3.9979 | 40.9161 |
2.2972 | 0.94 | 5000 | 2.8441 | 33.4146 | 11.8371 | 22.7219 | 27.1678 | 15.4977 | 9.1783 | 5.3303 | 43.8765 |
2.0162 | 1.13 | 6000 | 2.8372 | 34.9461 | 11.8978 | 22.7877 | 28.9743 | 16.3778 | 9.2932 | 5.0534 | 47.1585 |
1.9816 | 1.32 | 7000 | 2.8630 | 33.1249 | 10.8834 | 22.0846 | 27.0042 | 14.9508 | 8.3482 | 4.5422 | 44.676 |
2.0172 | 1.51 | 8000 | 2.7998 | 34.1663 | 11.5471 | 22.8156 | 28.0367 | 15.7969 | 8.6235 | 4.5914 | 44.9254 |
2.017 | 1.7 | 9000 | 2.7865 | 33.3775 | 11.194 | 22.6083 | 26.7485 | 14.9797 | 8.2559 | 4.279 | 41.5828 |
1.9734 | 1.89 | 10000 | 2.7532 | 34.7147 | 12.353 | 23.0917 | 28.8012 | 16.7472 | 9.7079 | 5.5416 | 47.9883 |
1.5937 | 2.08 | 11000 | 2.8433 | 34.9402 | 12.2318 | 23.2483 | 28.8006 | 16.5212 | 9.6008 | 5.3947 | 45.2401 |
1.6112 | 2.27 | 12000 | 2.8377 | 34.9291 | 12.2349 | 23.278 | 28.8423 | 16.539 | 9.7674 | 5.4267 | 44.7599 |
1.603 | 2.45 | 13000 | 2.8223 | 35.3837 | 12.5491 | 23.5272 | 29.3683 | 16.9828 | 9.6955 | 5.3166 | 47.6037 |
1.6274 | 2.64 | 14000 | 2.8220 | 34.0515 | 11.7884 | 22.829 | 27.6635 | 15.8021 | 8.9724 | 4.9314 | 44.1235 |
1.6435 | 2.83 | 15000 | 2.8139 | 34.9239 | 12.2122 | 22.9939 | 29.1796 | 16.763 | 9.5513 | 5.174 | 46.7832 |
1.238 | 3.02 | 16000 | 2.9615 | 35.456 | 12.3012 | 23.3111 | 29.8676 | 17.0768 | 9.8694 | 5.4376 | 51.1935 |
1.2767 | 3.21 | 17000 | 2.9781 | 35.2632 | 12.1441 | 23.2537 | 29.1438 | 16.6216 | 9.353 | 5.1593 | 46.0793 |
1.2868 | 3.4 | 18000 | 2.9723 | 34.6808 | 11.9638 | 22.9058 | 28.9988 | 16.4994 | 9.3619 | 5.1178 | 47.4732 |
1.2842 | 3.59 | 19000 | 2.9688 | 35.3792 | 12.5174 | 23.2012 | 29.6403 | 17.1517 | 9.9507 | 5.5561 | 49.1515 |
1.2931 | 3.78 | 20000 | 2.9694 | 35.7525 | 12.8025 | 23.5228 | 29.8102 | 17.3544 | 10.239 | 5.6637 | 49.1189 |
1.2733 | 3.97 | 21000 | 2.9618 | 35.8931 | 12.627 | 23.5571 | 30.0482 | 17.2582 | 9.8412 | 5.4747 | 48.5082 |
0.963 | 4.15 | 22000 | 3.1113 | 35.7523 | 12.7633 | 23.3127 | 30.0193 | 17.4211 | 10.2596 | 5.853 | 51.6993 |
0.9563 | 4.34 | 23000 | 3.1031 | 35.8437 | 12.6323 | 23.6011 | 30.0923 | 17.4089 | 9.9831 | 5.5993 | 48.7646 |
0.992 | 4.53 | 24000 | 3.1016 | 36.1067 | 13.3428 | 24.0267 | 30.0275 | 17.8733 | 10.6929 | 6.2491 | 52.0373 |
0.9722 | 4.72 | 25000 | 3.0956 | 35.4406 | 12.4799 | 23.3418 | 29.5123 | 17.0292 | 9.7401 | 5.3586 | 48.8974 |
0.9519 | 4.91 | 26000 | 3.0961 | 35.8883 | 12.7003 | 23.3874 | 30.2528 | 17.5183 | 10.2094 | 5.6021 | 50.1562 |
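In the table above, validation loss bottoms out at 2.7532 around step 10000 (epoch ~1.89) and rises afterward even as training loss keeps falling, the usual overfitting signature. Whether this run kept the final or the best checkpoint is not stated; a trivial sketch of selecting the lowest-validation-loss checkpoint from a logged history (abridged pairs taken from the table):

```python
# (step, validation_loss) pairs, abridged from the training-results table.
history = [(5000, 2.8441), (10000, 2.7532), (15000, 2.8139),
           (20000, 2.9694), (26000, 3.0961)]

# Pick the checkpoint with the lowest validation loss.
best_step, best_loss = min(history, key=lambda p: p[1])
print(best_step, best_loss)  # 10000 2.7532
```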
### Framework versions
- Transformers 4.25.1
- Pytorch 1.13.0+cu117
- Datasets 2.7.1
- Tokenizers 0.13.2