# fine-tuned-FLAN-T5-20-epochs-wanglab-512-output
This model is a fine-tuned FLAN-T5 checkpoint (base model not specified) trained on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 6.0705
- Rouge1: 0.1508
- Rouge2: 0.0272
- Rougel: 0.1374
- Rougelsum: 0.1351
- Bertscore F1: 0.8553
- Bleurt Score: -1.2097
- Gen Len: 14.69
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 20
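As a worked check on these settings: the results table below records 301 optimizer steps per epoch (implying roughly 1,200 training examples at batch size 4), so 20 epochs give 6,020 total steps, and a warmup ratio of 0.1 under the linear scheduler corresponds to 602 warmup steps. A minimal sketch of that arithmetic (plain Python, for illustration only, not the actual training script):

```python
# Derive schedule-related quantities from the hyperparameters above.
steps_per_epoch = 301        # from the results table (step 301 at epoch 1.0)
num_epochs = 20
warmup_ratio = 0.1

total_steps = steps_per_epoch * num_epochs      # matches the final step count in the table
warmup_steps = int(warmup_ratio * total_steps)  # linear warmup covers the first 602 steps

print(total_steps, warmup_steps)  # 6020 602
```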
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Bertscore F1 | Bleurt Score | Gen Len |
|---|---|---|---|---|---|---|---|---|---|---|
| No log | 1.0 | 301 | 11.0933 | 0.065 | 0.0148 | 0.0596 | 0.0595 | 0.7859 | -1.4402 | 18.92 |
| 20.9249 | 2.0 | 602 | 9.2324 | 0.0604 | 0.0154 | 0.0556 | 0.0554 | 0.7869 | -1.3807 | 17.42 |
| 20.9249 | 3.0 | 903 | 7.6254 | 0.0681 | 0.0192 | 0.0632 | 0.0627 | 0.7978 | -1.4375 | 18.42 |
| 11.3584 | 4.0 | 1204 | 6.7112 | 0.0614 | 0.0073 | 0.0578 | 0.0582 | 0.8076 | -1.3157 | 14.34 |
| 8.9106 | 5.0 | 1505 | 6.6742 | 0.0701 | 0.0204 | 0.0638 | 0.0635 | 0.7968 | -1.3894 | 17.29 |
| 8.9106 | 6.0 | 1806 | 5.9658 | 0.0836 | 0.0145 | 0.074 | 0.0742 | 0.818 | -1.3081 | 13.76 |
| 7.8674 | 7.0 | 2107 | 5.7095 | 0.113 | 0.025 | 0.1061 | 0.1078 | 0.8433 | -1.4119 | 13.71 |
| 7.8674 | 8.0 | 2408 | 5.6269 | 0.0987 | 0.0147 | 0.0933 | 0.0939 | 0.8201 | -1.2529 | 15.32 |
| 6.7786 | 9.0 | 2709 | 5.5192 | 0.1133 | 0.0203 | 0.1038 | 0.1051 | 0.8484 | -1.3751 | 13.75 |
| 6.3646 | 10.0 | 3010 | 5.4626 | 0.1347 | 0.0276 | 0.122 | 0.1236 | 0.8501 | -1.278 | 13.16 |
| 6.3646 | 11.0 | 3311 | 5.4467 | 0.103 | 0.0172 | 0.0951 | 0.0943 | 0.8263 | -1.3587 | 15.48 |
| 5.6998 | 12.0 | 3612 | 5.4587 | 0.126 | 0.0326 | 0.1191 | 0.1183 | 0.8474 | -1.2782 | 15.86 |
| 5.6998 | 13.0 | 3913 | 5.4846 | 0.1523 | 0.0325 | 0.1407 | 0.1408 | 0.8528 | -1.2406 | 14.82 |
| 5.2971 | 14.0 | 4214 | 5.6166 | 0.1363 | 0.0275 | 0.1279 | 0.1247 | 0.8512 | -1.2827 | 14.7 |
| 4.9391 | 15.0 | 4515 | 5.6821 | 0.1479 | 0.0238 | 0.136 | 0.1342 | 0.8545 | -1.2217 | 14.72 |
| 4.9391 | 16.0 | 4816 | 5.7849 | 0.1577 | 0.0307 | 0.1455 | 0.1445 | 0.8566 | -1.1756 | 15.25 |
| 4.6035 | 17.0 | 5117 | 5.8945 | 0.1313 | 0.0234 | 0.1214 | 0.1199 | 0.8525 | -1.2609 | 14.67 |
| 4.6035 | 18.0 | 5418 | 5.9956 | 0.1506 | 0.0315 | 0.1367 | 0.1348 | 0.8542 | -1.2107 | 14.61 |
| 4.3893 | 19.0 | 5719 | 6.0337 | 0.1449 | 0.0294 | 0.1337 | 0.1317 | 0.8553 | -1.2173 | 14.49 |
| 4.245 | 20.0 | 6020 | 6.0705 | 0.1508 | 0.0272 | 0.1374 | 0.1351 | 0.8553 | -1.2097 | 14.69 |
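Note that validation loss reaches its minimum at epoch 11 (5.4467) and rises afterwards, while ROUGE and BERTScore continue to improve through epoch 20; loss and generation metrics diverging this way is worth keeping in mind when picking a checkpoint. A quick check over the table values:

```python
# (epoch, validation loss) pairs copied from the results table above
val_loss = {
    1: 11.0933, 2: 9.2324, 3: 7.6254, 4: 6.7112, 5: 6.6742,
    6: 5.9658, 7: 5.7095, 8: 5.6269, 9: 5.5192, 10: 5.4626,
    11: 5.4467, 12: 5.4587, 13: 5.4846, 14: 5.6166, 15: 5.6821,
    16: 5.7849, 17: 5.8945, 18: 5.9956, 19: 6.0337, 20: 6.0705,
}

best_epoch = min(val_loss, key=val_loss.get)
print(best_epoch, val_loss[best_epoch])  # 11 5.4467
```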
### Framework versions
- Transformers 4.37.2
- Pytorch 2.1.0+cu121
- Datasets 2.17.1
- Tokenizers 0.15.2