# t5-small-fine-tuned_model_4
This model is a fine-tuned version of [t5-small](https://huggingface.co/google-t5/t5-small) on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 0.5774
- Rouge1: 36.746
- Rouge2: 27.845
- Rougel: 33.0926
- Rougelsum: 33.2212
- Gen Len: 1103.6667
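The Rouge1/Rouge2 scores above measure unigram and bigram overlap between generated and reference text. As an illustration only, here is a minimal sketch of ROUGE-1 F1; the reported numbers come from the standard `rouge_score` package, which additionally applies stemming and other normalization, so this simplified version will not reproduce them exactly:

```python
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 between a candidate and a reference string."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

# Example: 5 of 6 unigrams match in each direction -> F1 ≈ 0.833
print(rouge1_f1("the cat sat on the mat", "the cat is on the mat"))
```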
## Model description
More information needed
## Intended uses & limitations
More information needed
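Pending proper documentation, a minimal inference sketch follows. The task (summarization) and the generation length are assumptions, not documented facts; the long reported generation lengths suggest long-form output:

```python
from transformers import pipeline

# Hypothetical usage; "summarization" and max_length are assumptions —
# the card does not document the intended task.
summarizer = pipeline(
    "summarization",
    model="imhereforthememes/t5-small-fine-tuned_model_4",
)
text = "Replace this with the kind of input the model was trained on."
print(summarizer(text, max_length=128)[0]["summary_text"])
```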
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 200
- mixed_precision_training: Native AMP
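These settings map directly onto the standard `transformers` training API. A hedged reconstruction as `Seq2SeqTrainingArguments` (the `output_dir` and `predict_with_generate` values are assumptions, not taken from the original run):

```python
from transformers import Seq2SeqTrainingArguments

# Reconstruction of the listed hyperparameters; output_dir and
# predict_with_generate are assumptions, not from the original run.
training_args = Seq2SeqTrainingArguments(
    output_dir="t5-small-fine-tuned_model_4",
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=200,
    fp16=True,  # "Native AMP" mixed precision
    predict_with_generate=True,
)
```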
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|
No log | 1.9231 | 25 | 2.8693 | 12.3052 | 0.5791 | 11.7049 | 9.5975 | 60.6667 |
No log | 3.8462 | 50 | 2.4096 | 10.877 | 0.8174 | 10.3456 | 8.2477 | 60.5 |
No log | 5.7692 | 75 | 2.0937 | 9.5364 | 0.8174 | 8.9546 | 7.0331 | 77.5 |
No log | 7.6923 | 100 | 1.9212 | 9.4414 | 0.7372 | 8.7846 | 6.8375 | 77.5 |
No log | 9.6154 | 125 | 1.7628 | 9.4796 | 0.8174 | 8.9051 | 7.0161 | 77.8333 |
No log | 11.5385 | 150 | 1.6492 | 9.3248 | 0.7389 | 8.7894 | 6.8375 | 77.8333 |
No log | 13.4615 | 175 | 1.5335 | 7.5301 | 0.6238 | 7.2128 | 4.7712 | 399.6667 |
No log | 15.3846 | 200 | 1.4494 | 7.4893 | 0.6238 | 7.1749 | 5.0595 | 399.6667 |
No log | 17.3077 | 225 | 1.3791 | 3.6219 | 0.2407 | 3.4618 | 3.4618 | 1383.8333 |
No log | 19.2308 | 250 | 1.3344 | 2.3996 | 0.3987 | 1.9002 | 1.9002 | 1061.6667 |
No log | 21.1538 | 275 | 1.2727 | 4.0949 | 0.2788 | 3.5988 | 3.5988 | 1061.6667 |
No log | 23.0769 | 300 | 1.2295 | 4.4727 | 0.2404 | 4.0127 | 4.0344 | 739.5 |
No log | 25.0 | 325 | 1.1843 | 4.4115 | 0.1604 | 3.6418 | 2.8481 | 1080.5 |
No log | 26.9231 | 350 | 1.1238 | 4.3344 | 0.1606 | 3.7237 | 2.8672 | 1080.5 |
No log | 28.8462 | 375 | 1.0926 | 4.4134 | 0.1604 | 3.6827 | 2.8672 | 1080.5 |
No log | 30.7692 | 400 | 1.0685 | 4.4134 | 0.1604 | 3.6827 | 2.8672 | 1080.5 |
No log | 32.6923 | 425 | 1.0397 | 4.4996 | 0.2589 | 3.9347 | 2.9913 | 1080.5 |
No log | 34.6154 | 450 | 1.0090 | 5.0042 | 0.7844 | 4.4025 | 3.5377 | 1080.5 |
No log | 36.5385 | 475 | 0.9822 | 5.282 | 1.0387 | 4.7809 | 4.3066 | 1080.5 |
1.8477 | 38.4615 | 500 | 0.9604 | 5.1772 | 0.9156 | 4.3204 | 3.8574 | 1080.5 |
1.8477 | 40.3846 | 525 | 0.9432 | 5.1273 | 0.9511 | 4.6355 | 4.0124 | 1080.5 |
1.8477 | 42.3077 | 550 | 0.9267 | 4.8886 | 0.8802 | 4.3344 | 3.7113 | 1080.5 |
1.8477 | 44.2308 | 575 | 0.9095 | 5.1853 | 0.962 | 4.4102 | 4.669 | 1080.5 |
1.8477 | 46.1538 | 600 | 0.8909 | 5.4135 | 0.9609 | 4.4226 | 3.8166 | 1080.5 |
1.8477 | 48.0769 | 625 | 0.8768 | 5.2497 | 0.882 | 4.3238 | 3.7122 | 1079.0 |
1.8477 | 50.0 | 650 | 0.8723 | 5.0982 | 0.7553 | 4.3749 | 4.3948 | 1080.6667 |
1.8477 | 51.9231 | 675 | 0.8593 | 5.1037 | 0.7584 | 4.1406 | 4.1194 | 1080.5 |
1.8477 | 53.8462 | 700 | 0.8546 | 2.8445 | 0.6707 | 2.3994 | 2.3994 | 1080.5 |
1.8477 | 55.7692 | 725 | 0.8369 | 3.0259 | 0.8755 | 2.5487 | 2.5491 | 1080.5 |
1.8477 | 57.6923 | 750 | 0.8265 | 2.9364 | 0.8348 | 2.4559 | 2.4559 | 1080.5 |
1.8477 | 59.6154 | 775 | 0.8177 | 12.9616 | 5.7206 | 12.0543 | 12.3145 | 743.3333 |
1.8477 | 61.5385 | 800 | 0.8044 | 7.5774 | 3.6548 | 7.474 | 7.5189 | 1065.8333 |
1.8477 | 63.4615 | 825 | 0.7919 | 5.5517 | 1.0848 | 5.2334 | 4.1167 | 1058.6667 |
1.8477 | 65.3846 | 850 | 0.7907 | 15.4548 | 7.5458 | 13.4559 | 12.6255 | 518.6667 |
1.8477 | 67.3077 | 875 | 0.7854 | 12.3467 | 5.8196 | 11.7847 | 10.6211 | 490.6667 |
1.8477 | 69.2308 | 900 | 0.7807 | 19.1325 | 10.1098 | 17.0462 | 16.2033 | 179.0 |
1.8477 | 71.1538 | 925 | 0.7679 | 10.5941 | 4.8942 | 10.0005 | 8.6433 | 736.6667 |
1.8477 | 73.0769 | 950 | 0.7609 | 16.2234 | 10.4765 | 15.0 | 14.4657 | 499.6667 |
1.8477 | 75.0 | 975 | 0.7566 | 17.9094 | 12.6672 | 16.8839 | 16.2986 | 800.3333 |
1.067 | 76.9231 | 1000 | 0.7485 | 18.6028 | 12.7736 | 16.6168 | 16.1664 | 813.6667 |
1.067 | 78.8462 | 1025 | 0.7483 | 17.2067 | 12.1333 | 15.5985 | 15.2003 | 867.1667 |
1.067 | 80.7692 | 1050 | 0.7326 | 25.0325 | 19.4385 | 23.4785 | 23.4473 | 1113.6667 |
1.067 | 82.6923 | 1075 | 0.7222 | 24.0064 | 18.9078 | 22.4609 | 22.297 | 1113.6667 |
1.067 | 84.6154 | 1100 | 0.7171 | 29.6848 | 22.1186 | 27.4607 | 27.9158 | 1091.8333 |
1.067 | 86.5385 | 1125 | 0.7158 | 23.6259 | 17.0707 | 21.4673 | 20.8102 | 1386.5 |
1.067 | 88.4615 | 1150 | 0.7011 | 22.7916 | 17.2723 | 21.5803 | 20.3957 | 1064.3333 |
1.067 | 90.3846 | 1175 | 0.7069 | 30.4225 | 20.1908 | 25.9389 | 24.8933 | 1252.5 |
1.067 | 92.3077 | 1200 | 0.6928 | 24.476 | 18.5105 | 21.7682 | 20.6288 | 1419.1667 |
1.067 | 94.2308 | 1225 | 0.6924 | 34.3412 | 23.4853 | 30.2292 | 28.6861 | 1402.6667 |
1.067 | 96.1538 | 1250 | 0.6900 | 36.8626 | 24.9817 | 31.8421 | 30.1301 | 1402.6667 |
1.067 | 98.0769 | 1275 | 0.6838 | 36.4213 | 25.4676 | 32.0357 | 30.8773 | 1098.0 |
1.067 | 100.0 | 1300 | 0.6786 | 27.8196 | 18.9336 | 24.1778 | 23.6849 | 1419.3333 |
1.067 | 101.9231 | 1325 | 0.6758 | 36.6034 | 27.2136 | 31.8824 | 30.9734 | 1094.6667 |
1.067 | 103.8462 | 1350 | 0.6661 | 35.9341 | 26.937 | 31.4599 | 30.4341 | 1096.5 |
1.067 | 105.7692 | 1375 | 0.6649 | 37.8479 | 26.9005 | 33.634 | 32.6227 | 1402.6667 |
1.067 | 107.6923 | 1400 | 0.6592 | 39.6796 | 28.1931 | 33.4312 | 32.7645 | 1119.6667 |
1.067 | 109.6154 | 1425 | 0.6571 | 33.8316 | 23.0668 | 30.0589 | 29.8248 | 1444.0 |
1.067 | 111.5385 | 1450 | 0.6578 | 39.1579 | 29.4923 | 34.6578 | 34.0201 | 1150.8333 |
1.067 | 113.4615 | 1475 | 0.6500 | 42.4567 | 28.9735 | 37.3988 | 35.6678 | 835.3333 |
0.8748 | 115.3846 | 1500 | 0.6451 | 37.9141 | 25.7378 | 33.9276 | 31.9374 | 1192.0 |
0.8748 | 117.3077 | 1525 | 0.6422 | 33.8898 | 24.3709 | 30.1179 | 28.4935 | 1433.8333 |
0.8748 | 119.2308 | 1550 | 0.6368 | 41.9531 | 28.6251 | 36.0198 | 34.2882 | 1161.1667 |
0.8748 | 121.1538 | 1575 | 0.6344 | 37.3211 | 25.806 | 32.6797 | 30.9541 | 1426.0 |
0.8748 | 123.0769 | 1600 | 0.6304 | 42.411 | 29.4476 | 36.0243 | 34.2522 | 844.8333 |
0.8748 | 125.0 | 1625 | 0.6276 | 38.234 | 26.7376 | 31.9694 | 30.1662 | 1282.0 |
0.8748 | 126.9231 | 1650 | 0.6257 | 29.1 | 21.2848 | 26.7081 | 26.0839 | 1582.0 |
0.8748 | 128.8462 | 1675 | 0.6227 | 36.314 | 26.1402 | 31.5141 | 29.605 | 901.0 |
0.8748 | 130.7692 | 1700 | 0.6198 | 35.9501 | 25.86 | 31.4977 | 29.631 | 1426.0 |
0.8748 | 132.6923 | 1725 | 0.6224 | 35.7537 | 25.5328 | 30.8648 | 29.0449 | 1213.1667 |
0.8748 | 134.6154 | 1750 | 0.6172 | 26.1073 | 20.1223 | 23.4478 | 23.2837 | 1415.6667 |
0.8748 | 136.5385 | 1775 | 0.6207 | 29.6349 | 20.975 | 25.1467 | 23.1802 | 1183.1667 |
0.8748 | 138.4615 | 1800 | 0.6137 | 37.7711 | 27.3393 | 33.4935 | 32.0528 | 856.0 |
0.8748 | 140.3846 | 1825 | 0.6100 | 43.4868 | 32.9129 | 38.9293 | 37.6363 | 802.8333 |
0.8748 | 142.3077 | 1850 | 0.6091 | 33.0147 | 23.8658 | 29.2587 | 27.4546 | 1121.3333 |
0.8748 | 144.2308 | 1875 | 0.6075 | 43.7785 | 31.7055 | 38.7407 | 37.2803 | 1118.8333 |
0.8748 | 146.1538 | 1900 | 0.6028 | 26.047 | 19.8575 | 23.2564 | 22.9809 | 1413.6667 |
0.8748 | 148.0769 | 1925 | 0.6020 | 25.8242 | 19.4078 | 23.0936 | 22.7658 | 1307.6667 |
0.8748 | 150.0 | 1950 | 0.6017 | 29.0892 | 22.112 | 26.4017 | 26.2841 | 1203.8333 |
0.8748 | 151.9231 | 1975 | 0.5972 | 35.2639 | 27.0773 | 31.3233 | 31.6303 | 1110.3333 |
0.7957 | 153.8462 | 2000 | 0.5973 | 34.7757 | 25.617 | 31.3525 | 29.5239 | 909.6667 |
0.7957 | 155.7692 | 2025 | 0.5963 | 39.2421 | 28.6572 | 35.3195 | 33.3751 | 1141.5 |
0.7957 | 157.6923 | 2050 | 0.5971 | 40.705 | 28.8564 | 35.7408 | 34.4547 | 825.0 |
0.7957 | 159.6154 | 2075 | 0.5940 | 32.7885 | 24.3525 | 29.2913 | 28.7441 | 1137.0 |
0.7957 | 161.5385 | 2100 | 0.5918 | 36.1103 | 26.7078 | 32.6294 | 32.165 | 1109.3333 |
0.7957 | 163.4615 | 2125 | 0.5901 | 33.3945 | 24.3106 | 29.5642 | 29.1211 | 1122.0 |
0.7957 | 165.3846 | 2150 | 0.5887 | 33.0641 | 25.1483 | 28.5307 | 28.5341 | 1136.5 |
0.7957 | 167.3077 | 2175 | 0.5880 | 33.9058 | 25.3162 | 30.4182 | 30.4334 | 1122.3333 |
0.7957 | 169.2308 | 2200 | 0.5856 | 40.1053 | 29.2168 | 35.5522 | 35.4371 | 823.8333 |
0.7957 | 171.1538 | 2225 | 0.5831 | 40.3565 | 29.3126 | 35.7256 | 36.0169 | 822.5 |
0.7957 | 173.0769 | 2250 | 0.5841 | 32.7059 | 24.2819 | 28.9926 | 29.0977 | 1123.6667 |
0.7957 | 175.0 | 2275 | 0.5833 | 31.6615 | 23.8942 | 28.0682 | 27.8896 | 1131.8333 |
0.7957 | 176.9231 | 2300 | 0.5809 | 33.3448 | 25.3438 | 29.9738 | 29.9541 | 1123.3333 |
0.7957 | 178.8462 | 2325 | 0.5809 | 34.7749 | 25.4859 | 30.8064 | 30.966 | 1133.0 |
0.7957 | 180.7692 | 2350 | 0.5794 | 35.5149 | 25.6719 | 31.4748 | 31.7132 | 1125.5 |
0.7957 | 182.6923 | 2375 | 0.5798 | 37.8505 | 28.7851 | 33.1326 | 33.9326 | 1101.5 |
0.7957 | 184.6154 | 2400 | 0.5795 | 44.9411 | 31.929 | 39.3159 | 40.2195 | 814.8333 |
0.7957 | 186.5385 | 2425 | 0.5785 | 45.3771 | 32.5483 | 39.8906 | 40.6959 | 810.0 |
0.7957 | 188.4615 | 2450 | 0.5779 | 36.6623 | 27.845 | 32.6189 | 33.0543 | 1103.6667 |
0.7957 | 190.3846 | 2475 | 0.5774 | 36.6466 | 28.0217 | 32.5312 | 33.3439 | 843.3333 |
0.7481 | 192.3077 | 2500 | 0.5775 | 36.3607 | 27.8503 | 32.5271 | 32.9644 | 1103.6667 |
0.7481 | 194.2308 | 2525 | 0.5775 | 36.8397 | 27.9327 | 32.6315 | 33.0669 | 1103.6667 |
0.7481 | 196.1538 | 2550 | 0.5776 | 36.7963 | 27.8825 | 33.0842 | 33.2128 | 1103.6667 |
0.7481 | 198.0769 | 2575 | 0.5774 | 36.7436 | 27.845 | 33.0878 | 33.219 | 1103.6667 |
0.7481 | 200.0 | 2600 | 0.5774 | 36.746 | 27.845 | 33.0926 | 33.2212 | 1103.6667 |
## Framework versions
- Transformers 4.44.2
- Pytorch 2.5.0+cu121
- Datasets 3.0.2
- Tokenizers 0.19.1
Model: imhereforthememes/t5-small-fine-tuned_model_4 (base model: [google-t5/t5-small](https://huggingface.co/google-t5/t5-small))