
flan-t5-rouge-durga-q5-clean-4d

This model is a fine-tuned version of google/flan-t5-base (the training dataset is not reported in this card). It achieves the following results on the evaluation set:

  • Loss: 0.0519
  • ROUGE-1: 0.5221
  • ROUGE-2: 0.4278
  • ROUGE-L: 0.5213
  • ROUGE-Lsum: 0.5204
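The ROUGE scores above measure n-gram overlap between generated and reference text. As an illustrative sketch only (not the official `rouge_score` package used to produce these numbers, which also applies stemming and other normalization), ROUGE-1 F1 can be computed like this:

```python
# Minimal ROUGE-1 F1 sketch: unigram overlap between a prediction and a
# reference. Illustrative only; the official rouge_score implementation
# differs (stemming, tokenization rules).
from collections import Counter

def rouge1_f1(prediction: str, reference: str) -> float:
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    # Clipped unigram overlap: each token counts at most as often as it
    # appears in the other sequence.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

For example, `rouge1_f1("the cat sat", "the cat sat on the mat")` gives precision 1.0 and recall 0.5, hence F1 ≈ 0.667.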

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 24
  • eval_batch_size: 24
  • seed: 42
  • optimizer: adamw_torch (AdamW) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 60
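The hyperparameters above map onto a `Seq2SeqTrainingArguments` configuration roughly as follows. This is a reconstruction, not the author's script: the output directory is a placeholder, and `predict_with_generate` plus per-epoch evaluation are assumptions inferred from the per-epoch ROUGE results reported below.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch reconstructed from the hyperparameter list above.
# output_dir is a placeholder; eval_strategy and predict_with_generate
# are assumptions (needed to produce per-epoch ROUGE scores).
training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-rouge-durga-q5-clean-4d",  # placeholder
    learning_rate=1e-4,
    per_device_train_batch_size=24,
    per_device_eval_batch_size=24,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=60,
    eval_strategy="epoch",        # assumption: table reports per-epoch validation
    predict_with_generate=True,   # assumption: required for ROUGE on generations
)
```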

Training results

| Training Loss | Epoch | Step | Validation Loss | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum |
|---------------|-------|------|-----------------|---------|---------|---------|------------|
| 2.4357 | 1.0 | 9 | 1.9785 | 0.2586 | 0.0742 | 0.2539 | 0.2539 |
| 2.6395 | 2.0 | 18 | 1.6948 | 0.2578 | 0.0708 | 0.2527 | 0.2525 |
| 1.7751 | 3.0 | 27 | 1.4660 | 0.2843 | 0.0833 | 0.2773 | 0.2777 |
| 2.0201 | 4.0 | 36 | 1.2841 | 0.3119 | 0.1080 | 0.3055 | 0.3068 |
| 1.9879 | 5.0 | 45 | 1.1375 | 0.3388 | 0.1313 | 0.3321 | 0.3333 |
| 1.6617 | 6.0 | 54 | 0.9940 | 0.3351 | 0.1264 | 0.3256 | 0.3259 |
| 1.5556 | 7.0 | 63 | 0.8861 | 0.3647 | 0.1620 | 0.3567 | 0.3569 |
| 1.2433 | 8.0 | 72 | 0.7889 | 0.3656 | 0.1716 | 0.3580 | 0.3579 |
| 1.252 | 9.0 | 81 | 0.6992 | 0.3651 | 0.1773 | 0.3563 | 0.3571 |
| 1.0389 | 10.0 | 90 | 0.6118 | 0.3777 | 0.1866 | 0.3699 | 0.3705 |
| 0.6633 | 11.0 | 99 | 0.5348 | 0.3646 | 0.1800 | 0.3589 | 0.3584 |
| 0.7738 | 12.0 | 108 | 0.4685 | 0.3909 | 0.2112 | 0.3844 | 0.3844 |
| 0.7849 | 13.0 | 117 | 0.4048 | 0.3843 | 0.2150 | 0.3766 | 0.3769 |
| 0.9278 | 14.0 | 126 | 0.3418 | 0.3973 | 0.2315 | 0.3915 | 0.3918 |
| 0.7269 | 15.0 | 135 | 0.3038 | 0.4066 | 0.2593 | 0.4001 | 0.4016 |
| 0.6558 | 16.0 | 144 | 0.2834 | 0.4323 | 0.2812 | 0.4289 | 0.4292 |
| 0.5569 | 17.0 | 153 | 0.2396 | 0.4287 | 0.2817 | 0.4219 | 0.4235 |
| 0.6052 | 18.0 | 162 | 0.2186 | 0.4382 | 0.2981 | 0.4323 | 0.4334 |
| 0.575 | 19.0 | 171 | 0.1989 | 0.4194 | 0.2784 | 0.4159 | 0.4162 |
| 0.5307 | 20.0 | 180 | 0.1722 | 0.4403 | 0.2978 | 0.4340 | 0.4357 |
| 0.4588 | 21.0 | 189 | 0.1643 | 0.4636 | 0.3195 | 0.4570 | 0.4580 |
| 0.3977 | 22.0 | 198 | 0.1431 | 0.4546 | 0.3234 | 0.4491 | 0.4504 |
| 0.4509 | 23.0 | 207 | 0.1388 | 0.4621 | 0.3336 | 0.4567 | 0.4571 |
| 0.3736 | 24.0 | 216 | 0.1277 | 0.4495 | 0.3262 | 0.4426 | 0.4438 |
| 0.3618 | 25.0 | 225 | 0.1198 | 0.4622 | 0.3424 | 0.4571 | 0.4585 |
| 0.3059 | 26.0 | 234 | 0.1090 | 0.4718 | 0.3475 | 0.4677 | 0.4678 |
| 0.2782 | 27.0 | 243 | 0.1039 | 0.4722 | 0.3512 | 0.4675 | 0.4677 |
| 0.2374 | 28.0 | 252 | 0.1006 | 0.4650 | 0.3408 | 0.4621 | 0.4625 |
| 0.228 | 29.0 | 261 | 0.0945 | 0.4818 | 0.3571 | 0.4778 | 0.4782 |
| 0.2778 | 30.0 | 270 | 0.0948 | 0.4732 | 0.3582 | 0.4710 | 0.4719 |
| 0.2601 | 31.0 | 279 | 0.0889 | 0.4822 | 0.3626 | 0.4791 | 0.4803 |
| 0.2364 | 32.0 | 288 | 0.0866 | 0.4863 | 0.3724 | 0.4851 | 0.4865 |
| 0.2124 | 33.0 | 297 | 0.0855 | 0.4841 | 0.3666 | 0.4829 | 0.4836 |
| 0.2004 | 34.0 | 306 | 0.0809 | 0.4835 | 0.3715 | 0.4819 | 0.4831 |
| 0.2095 | 35.0 | 315 | 0.0764 | 0.4797 | 0.3666 | 0.4778 | 0.4796 |
| 0.3603 | 36.0 | 324 | 0.0744 | 0.4934 | 0.3815 | 0.4924 | 0.4925 |
| 0.181 | 37.0 | 333 | 0.0718 | 0.4863 | 0.3754 | 0.4864 | 0.4866 |
| 0.1435 | 38.0 | 342 | 0.0687 | 0.4857 | 0.3778 | 0.4859 | 0.4861 |
| 0.1306 | 39.0 | 351 | 0.0676 | 0.4921 | 0.3826 | 0.4903 | 0.4907 |
| 0.1668 | 40.0 | 360 | 0.0667 | 0.4853 | 0.3784 | 0.4832 | 0.4845 |
| 0.2279 | 41.0 | 369 | 0.0647 | 0.4998 | 0.3950 | 0.4967 | 0.4978 |
| 0.2863 | 42.0 | 378 | 0.0638 | 0.5018 | 0.4022 | 0.4992 | 0.4997 |
| 0.1381 | 43.0 | 387 | 0.0631 | 0.5066 | 0.4085 | 0.5037 | 0.5041 |
| 0.1868 | 44.0 | 396 | 0.0611 | 0.5081 | 0.4068 | 0.5062 | 0.5061 |
| 0.1351 | 45.0 | 405 | 0.0614 | 0.5018 | 0.4001 | 0.5011 | 0.5010 |
| 0.1355 | 46.0 | 414 | 0.0604 | 0.5051 | 0.4027 | 0.5040 | 0.5045 |
| 0.108 | 47.0 | 423 | 0.0588 | 0.4983 | 0.3956 | 0.4982 | 0.4983 |
| 0.133 | 48.0 | 432 | 0.0573 | 0.5082 | 0.4069 | 0.5073 | 0.5075 |
| 0.2242 | 49.0 | 441 | 0.0565 | 0.5117 | 0.4114 | 0.5104 | 0.5104 |
| 0.1678 | 50.0 | 450 | 0.0548 | 0.5241 | 0.4272 | 0.5222 | 0.5225 |
| 0.1282 | 51.0 | 459 | 0.0543 | 0.5224 | 0.4263 | 0.5206 | 0.5212 |
| 0.15 | 52.0 | 468 | 0.0531 | 0.5171 | 0.4209 | 0.5161 | 0.5169 |
| 0.1356 | 53.0 | 477 | 0.0528 | 0.5164 | 0.4178 | 0.5159 | 0.5158 |
| 0.134 | 54.0 | 486 | 0.0527 | 0.5180 | 0.4228 | 0.5176 | 0.5178 |
| 0.1321 | 55.0 | 495 | 0.0529 | 0.5162 | 0.4192 | 0.5155 | 0.5162 |
| 0.1362 | 56.0 | 504 | 0.0526 | 0.5166 | 0.4206 | 0.5157 | 0.5156 |
| 0.1764 | 57.0 | 513 | 0.0524 | 0.5170 | 0.4215 | 0.5153 | 0.5163 |
| 0.1549 | 58.0 | 522 | 0.0522 | 0.5221 | 0.4278 | 0.5213 | 0.5204 |
| 0.1475 | 59.0 | 531 | 0.0520 | 0.5221 | 0.4278 | 0.5213 | 0.5204 |
| 0.1441 | 60.0 | 540 | 0.0519 | 0.5221 | 0.4278 | 0.5213 | 0.5204 |
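The step column shows 9 optimizer steps per epoch, so the linear scheduler decays the learning rate over 540 total steps. A minimal sketch of that schedule, assuming no warmup (warmup steps are not reported in this card):

```python
# Linear learning-rate decay sketch, matching the reported settings:
# base LR 1e-4, 9 steps/epoch x 60 epochs = 540 total steps.
# Assumes zero warmup steps, which the card does not report.
BASE_LR = 1e-4
TOTAL_STEPS = 540

def lr_at(step: int) -> float:
    """LR after `step` optimizer steps under linear decay to zero."""
    return BASE_LR * max(0.0, 1.0 - step / TOTAL_STEPS)
```

Under this schedule the learning rate is 1e-4 at step 0, halves by the midpoint (step 270), and reaches zero at step 540.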

Framework versions

  • Transformers 4.46.0
  • PyTorch 2.5.0+cu121
  • Datasets 3.0.2
  • Tokenizers 0.20.1
Model size: 248M parameters (F32, Safetensors)

Full model id: devagonal/flan-t5-rouge-durga-q5-clean-4d (fine-tuned from google/flan-t5-base)