# flan-t5-rouge-durga-q5-clean-4d
This model is a fine-tuned version of [google/flan-t5-base](https://huggingface.co/google/flan-t5-base) on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 0.0519
- Rouge1: 0.5221
- Rouge2: 0.4278
- RougeL: 0.5213
- RougeLsum: 0.5204
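
A minimal inference sketch, assuming the checkpoint is public on the Hub; the prompt is a hypothetical placeholder (the model name suggests question answering about Durga, but the exact task is not documented):

```python
# Minimal inference sketch, assuming the checkpoint is public on the Hub.
# The prompt is a hypothetical placeholder; adapt it to the actual task.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "devagonal/flan-t5-rouge-durga-q5-clean-4d"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("Who is Durga?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```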
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a reproduction sketch follows the list):
- learning_rate: 0.0001
- train_batch_size: 24
- eval_batch_size: 24
- seed: 42
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 60
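
A reproduction sketch of these settings with the `Seq2SeqTrainer` API (Transformers 4.46); the dataset variables and preprocessing are assumptions for illustration, not the author's original script:

```python
# Reproduction sketch of the listed hyperparameters; not the original script.
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")

args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-rouge-durga-q5-clean-4d",
    learning_rate=1e-4,
    per_device_train_batch_size=24,
    per_device_eval_batch_size=24,
    seed=42,
    optim="adamw_torch",          # betas=(0.9, 0.999), epsilon=1e-08 are the defaults
    lr_scheduler_type="linear",
    num_train_epochs=60,
    eval_strategy="epoch",        # one evaluation per epoch, matching the results table
    predict_with_generate=True,   # generate text during evaluation so ROUGE can be computed
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,  # hypothetical: a tokenized seq2seq dataset
    eval_dataset=eval_dataset,    # hypothetical: a tokenized seq2seq dataset
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
    # compute_metrics for ROUGE is omitted here; see the sketch under "Training results".
)
trainer.train()
```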
### Training results
Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum |
---|---|---|---|---|---|---|---|
2.4357 | 1.0 | 9 | 1.9785 | 0.2586 | 0.0742 | 0.2539 | 0.2539 |
2.6395 | 2.0 | 18 | 1.6948 | 0.2578 | 0.0708 | 0.2527 | 0.2525 |
1.7751 | 3.0 | 27 | 1.4660 | 0.2843 | 0.0833 | 0.2773 | 0.2777 |
2.0201 | 4.0 | 36 | 1.2841 | 0.3119 | 0.1080 | 0.3055 | 0.3068 |
1.9879 | 5.0 | 45 | 1.1375 | 0.3388 | 0.1313 | 0.3321 | 0.3333 |
1.6617 | 6.0 | 54 | 0.9940 | 0.3351 | 0.1264 | 0.3256 | 0.3259 |
1.5556 | 7.0 | 63 | 0.8861 | 0.3647 | 0.1620 | 0.3567 | 0.3569 |
1.2433 | 8.0 | 72 | 0.7889 | 0.3656 | 0.1716 | 0.3580 | 0.3579 |
1.252 | 9.0 | 81 | 0.6992 | 0.3651 | 0.1773 | 0.3563 | 0.3571 |
1.0389 | 10.0 | 90 | 0.6118 | 0.3777 | 0.1866 | 0.3699 | 0.3705 |
0.6633 | 11.0 | 99 | 0.5348 | 0.3646 | 0.1800 | 0.3589 | 0.3584 |
0.7738 | 12.0 | 108 | 0.4685 | 0.3909 | 0.2112 | 0.3844 | 0.3844 |
0.7849 | 13.0 | 117 | 0.4048 | 0.3843 | 0.2150 | 0.3766 | 0.3769 |
0.9278 | 14.0 | 126 | 0.3418 | 0.3973 | 0.2315 | 0.3915 | 0.3918 |
0.7269 | 15.0 | 135 | 0.3038 | 0.4066 | 0.2593 | 0.4001 | 0.4016 |
0.6558 | 16.0 | 144 | 0.2834 | 0.4323 | 0.2812 | 0.4289 | 0.4292 |
0.5569 | 17.0 | 153 | 0.2396 | 0.4287 | 0.2817 | 0.4219 | 0.4235 |
0.6052 | 18.0 | 162 | 0.2186 | 0.4382 | 0.2981 | 0.4323 | 0.4334 |
0.575 | 19.0 | 171 | 0.1989 | 0.4194 | 0.2784 | 0.4159 | 0.4162 |
0.5307 | 20.0 | 180 | 0.1722 | 0.4403 | 0.2978 | 0.4340 | 0.4357 |
0.4588 | 21.0 | 189 | 0.1643 | 0.4636 | 0.3195 | 0.4570 | 0.4580 |
0.3977 | 22.0 | 198 | 0.1431 | 0.4546 | 0.3234 | 0.4491 | 0.4504 |
0.4509 | 23.0 | 207 | 0.1388 | 0.4621 | 0.3336 | 0.4567 | 0.4571 |
0.3736 | 24.0 | 216 | 0.1277 | 0.4495 | 0.3262 | 0.4426 | 0.4438 |
0.3618 | 25.0 | 225 | 0.1198 | 0.4622 | 0.3424 | 0.4571 | 0.4585 |
0.3059 | 26.0 | 234 | 0.1090 | 0.4718 | 0.3475 | 0.4677 | 0.4678 |
0.2782 | 27.0 | 243 | 0.1039 | 0.4722 | 0.3512 | 0.4675 | 0.4677 |
0.2374 | 28.0 | 252 | 0.1006 | 0.4650 | 0.3408 | 0.4621 | 0.4625 |
0.228 | 29.0 | 261 | 0.0945 | 0.4818 | 0.3571 | 0.4778 | 0.4782 |
0.2778 | 30.0 | 270 | 0.0948 | 0.4732 | 0.3582 | 0.4710 | 0.4719 |
0.2601 | 31.0 | 279 | 0.0889 | 0.4822 | 0.3626 | 0.4791 | 0.4803 |
0.2364 | 32.0 | 288 | 0.0866 | 0.4863 | 0.3724 | 0.4851 | 0.4865 |
0.2124 | 33.0 | 297 | 0.0855 | 0.4841 | 0.3666 | 0.4829 | 0.4836 |
0.2004 | 34.0 | 306 | 0.0809 | 0.4835 | 0.3715 | 0.4819 | 0.4831 |
0.2095 | 35.0 | 315 | 0.0764 | 0.4797 | 0.3666 | 0.4778 | 0.4796 |
0.3603 | 36.0 | 324 | 0.0744 | 0.4934 | 0.3815 | 0.4924 | 0.4925 |
0.181 | 37.0 | 333 | 0.0718 | 0.4863 | 0.3754 | 0.4864 | 0.4866 |
0.1435 | 38.0 | 342 | 0.0687 | 0.4857 | 0.3778 | 0.4859 | 0.4861 |
0.1306 | 39.0 | 351 | 0.0676 | 0.4921 | 0.3826 | 0.4903 | 0.4907 |
0.1668 | 40.0 | 360 | 0.0667 | 0.4853 | 0.3784 | 0.4832 | 0.4845 |
0.2279 | 41.0 | 369 | 0.0647 | 0.4998 | 0.3950 | 0.4967 | 0.4978 |
0.2863 | 42.0 | 378 | 0.0638 | 0.5018 | 0.4022 | 0.4992 | 0.4997 |
0.1381 | 43.0 | 387 | 0.0631 | 0.5066 | 0.4085 | 0.5037 | 0.5041 |
0.1868 | 44.0 | 396 | 0.0611 | 0.5081 | 0.4068 | 0.5062 | 0.5061 |
0.1351 | 45.0 | 405 | 0.0614 | 0.5018 | 0.4001 | 0.5011 | 0.5010 |
0.1355 | 46.0 | 414 | 0.0604 | 0.5051 | 0.4027 | 0.5040 | 0.5045 |
0.108 | 47.0 | 423 | 0.0588 | 0.4983 | 0.3956 | 0.4982 | 0.4983 |
0.133 | 48.0 | 432 | 0.0573 | 0.5082 | 0.4069 | 0.5073 | 0.5075 |
0.2242 | 49.0 | 441 | 0.0565 | 0.5117 | 0.4114 | 0.5104 | 0.5104 |
0.1678 | 50.0 | 450 | 0.0548 | 0.5241 | 0.4272 | 0.5222 | 0.5225 |
0.1282 | 51.0 | 459 | 0.0543 | 0.5224 | 0.4263 | 0.5206 | 0.5212 |
0.15 | 52.0 | 468 | 0.0531 | 0.5171 | 0.4209 | 0.5161 | 0.5169 |
0.1356 | 53.0 | 477 | 0.0528 | 0.5164 | 0.4178 | 0.5159 | 0.5158 |
0.134 | 54.0 | 486 | 0.0527 | 0.5180 | 0.4228 | 0.5176 | 0.5178 |
0.1321 | 55.0 | 495 | 0.0529 | 0.5162 | 0.4192 | 0.5155 | 0.5162 |
0.1362 | 56.0 | 504 | 0.0526 | 0.5166 | 0.4206 | 0.5157 | 0.5156 |
0.1764 | 57.0 | 513 | 0.0524 | 0.5170 | 0.4215 | 0.5153 | 0.5163 |
0.1549 | 58.0 | 522 | 0.0522 | 0.5221 | 0.4278 | 0.5213 | 0.5204 |
0.1475 | 59.0 | 531 | 0.0520 | 0.5221 | 0.4278 | 0.5213 | 0.5204 |
0.1441 | 60.0 | 540 | 0.0519 | 0.5221 | 0.4278 | 0.5213 | 0.5204 |
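
The Rouge columns are standard ROUGE F1 scores. A sketch of how such scores can be computed with the `evaluate` library (the prediction/reference strings are hypothetical):

```python
# ROUGE scoring sketch with the `evaluate` library; example texts are hypothetical.
import evaluate

rouge = evaluate.load("rouge")
predictions = ["durga is worshipped during the navaratri festival"]
references = ["the goddess durga is worshipped during navaratri"]

scores = rouge.compute(predictions=predictions, references=references)
print(scores)  # dict with keys: rouge1, rouge2, rougeL, rougeLsum
```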
### Framework versions
- Transformers 4.46.0
- Pytorch 2.5.0+cu121
- Datasets 3.0.2
- Tokenizers 0.20.1
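
A quick check that a local environment matches these versions (a convenience sketch, not part of the original card):

```python
# Print installed versions to compare against the list above.
import datasets, tokenizers, torch, transformers

for module in (transformers, torch, datasets, tokenizers):
    print(module.__name__, module.__version__)
```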