t5-small-fine-tuned_model_3
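This model is a fine-tuned version of t5-small. The training dataset was not recorded (the card generator reported it as "None"). It achieves the following results on the evaluation set at the final training step (step 1300):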
- Loss: 0.1560
- Rouge1: 75.4228
- Rouge2: 70.7071
- Rougel: 74.0159
- Rougelsum: 74.2555
- Gen Len: 396.1667
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.005
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 100
- mixed_precision_training: Native AMP
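The list above maps fairly directly onto keyword arguments accepted by `Seq2SeqTrainingArguments` in Transformers. The sketch below is a reconstruction under assumptions, not the author's training script: `output_dir`, the single-device setup, the evaluation cadence, `predict_with_generate`, and zero warmup are not stated in this card. It also illustrates what a linear schedule implies for the learning rate over the 1300 optimizer steps shown in the results table.

```python
# Hyperparameters from the list above, written as keyword arguments that
# transformers.Seq2SeqTrainingArguments accepts. Entries marked ASSUMPTION
# are not stated in this card.
training_kwargs = {
    "output_dir": "t5-small-fine-tuned_model_3",  # ASSUMPTION: hypothetical path
    "learning_rate": 5e-3,
    "per_device_train_batch_size": 8,   # ASSUMPTION: one device, so per-device == total
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 100,
    "fp16": True,                       # "Native AMP" mixed precision
    "eval_strategy": "steps",           # ASSUMPTION: inferred from the table cadence
    "eval_steps": 10,                   # ASSUMPTION: table rows are 10 steps apart
    "predict_with_generate": True,      # ASSUMPTION: needed for ROUGE and Gen Len
}

TOTAL_STEPS = 1300  # last step in the training results table


def linear_lr(step, base_lr=training_kwargs["learning_rate"],
              total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 650, 1300):
        print(step, linear_lr(step))
```

With no warmup, the rate starts at 0.005, is halved by step 650, and reaches zero at step 1300. That would be consistent with the metrics barely moving over the last rows of the table.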
Training results
Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
---|---|---|---|---|---|---|---|---|
No log | 0.7692 | 10 | 1.7251 | 13.6123 | 5.7258 | 13.1801 | 13.1787 | 1027.0 |
No log | 1.5385 | 20 | 1.5442 | 23.3131 | 17.9737 | 23.3131 | 23.3131 | 1027.0 |
No log | 2.3077 | 30 | 1.3803 | 12.2977 | 5.1904 | 11.7431 | 11.5969 | 1027.0 |
No log | 3.0769 | 40 | 1.2344 | 13.8993 | 11.771 | 13.8993 | 14.0091 | 1027.0 |
No log | 3.8462 | 50 | 1.1516 | 13.9042 | 11.8938 | 13.9042 | 14.0103 | 1027.0 |
No log | 4.6154 | 60 | 0.9481 | 14.3687 | 10.4481 | 13.1919 | 13.0214 | 876.6667 |
No log | 5.3846 | 70 | 0.9286 | 27.0525 | 14.6747 | 24.9583 | 24.3756 | 857.5 |
No log | 6.1538 | 80 | 0.8804 | 21.6353 | 12.8534 | 18.4791 | 18.4306 | 877.1667 |
No log | 6.9231 | 90 | 0.7841 | 47.5579 | 30.341 | 42.5411 | 42.7482 | 550.5 |
No log | 7.6923 | 100 | 0.7793 | 35.2203 | 25.009 | 31.3145 | 30.6642 | 859.8333 |
No log | 8.4615 | 110 | 0.6860 | 37.2436 | 29.1438 | 33.1425 | 32.729 | 859.6667 |
No log | 9.2308 | 120 | 0.7150 | 29.122 | 23.7579 | 27.5853 | 26.5771 | 859.6667 |
No log | 10.0 | 130 | 0.6579 | 51.6814 | 37.1169 | 47.9067 | 47.9272 | 530.1667 |
No log | 10.7692 | 140 | 0.6267 | 37.5717 | 28.0617 | 32.827 | 32.592 | 860.5 |
No log | 11.5385 | 150 | 0.6118 | 62.1203 | 49.9121 | 55.4072 | 54.9256 | 564.0 |
No log | 12.3077 | 160 | 0.5481 | 61.2435 | 49.738 | 55.7893 | 55.6371 | 565.1667 |
No log | 13.0769 | 170 | 0.5685 | 57.4855 | 47.8398 | 54.6011 | 53.7537 | 407.8333 |
No log | 13.8462 | 180 | 0.5603 | 63.7808 | 52.0648 | 58.9732 | 59.1514 | 107.6667 |
No log | 14.6154 | 190 | 0.4906 | 56.541 | 43.5496 | 50.0309 | 49.4554 | 402.3333 |
No log | 15.3846 | 200 | 0.4920 | 44.085 | 31.8595 | 41.9242 | 42.2744 | 130.6667 |
No log | 16.1538 | 210 | 0.4519 | 57.8642 | 47.346 | 53.4872 | 53.7607 | 294.0 |
No log | 16.9231 | 220 | 0.4319 | 44.5213 | 29.3385 | 36.2116 | 36.1914 | 481.0 |
No log | 17.6923 | 230 | 0.4147 | 52.262 | 33.4537 | 42.1175 | 42.8641 | 335.6667 |
No log | 18.4615 | 240 | 0.4411 | 33.5609 | 21.155 | 26.7958 | 27.0263 | 785.1667 |
No log | 19.2308 | 250 | 0.3791 | 62.4765 | 48.3805 | 56.5917 | 55.6696 | 301.6667 |
No log | 20.0 | 260 | 0.3913 | 66.6348 | 54.4823 | 59.6097 | 59.9255 | 144.5 |
No log | 20.7692 | 270 | 0.3530 | 54.5169 | 46.9471 | 50.0583 | 49.4563 | 431.1667 |
No log | 21.5385 | 280 | 0.3245 | 46.7808 | 38.4793 | 42.4197 | 42.2085 | 712.3333 |
No log | 22.3077 | 290 | 0.3368 | 47.0382 | 35.8428 | 39.104 | 38.5503 | 735.8333 |
No log | 23.0769 | 300 | 0.3297 | 53.7986 | 44.0834 | 46.4405 | 47.5762 | 654.3333 |
No log | 23.8462 | 310 | 0.2940 | 59.8414 | 45.4853 | 53.3007 | 53.3967 | 155.8333 |
No log | 24.6154 | 320 | 0.3340 | 65.9 | 52.8727 | 59.5371 | 59.484 | 227.1667 |
No log | 25.3846 | 330 | 0.2812 | 58.7644 | 47.4464 | 51.8233 | 52.0057 | 302.5 |
No log | 26.1538 | 340 | 0.2787 | 64.4588 | 51.5866 | 56.3922 | 55.9368 | 219.8333 |
No log | 26.9231 | 350 | 0.2872 | 55.727 | 45.0152 | 49.2849 | 49.2548 | 601.0 |
No log | 27.6923 | 360 | 0.2971 | 63.8289 | 52.3489 | 57.6671 | 57.2489 | 361.0 |
No log | 28.4615 | 370 | 0.2893 | 60.4914 | 49.2527 | 54.2347 | 54.2306 | 174.1667 |
No log | 29.2308 | 380 | 0.2479 | 65.7383 | 53.9204 | 57.7956 | 58.0014 | 304.0 |
No log | 30.0 | 390 | 0.2452 | 58.2415 | 49.1706 | 49.5983 | 49.1554 | 630.1667 |
No log | 30.7692 | 400 | 0.2504 | 54.9945 | 42.7543 | 45.5489 | 46.7113 | 664.3333 |
No log | 31.5385 | 410 | 0.2361 | 62.8874 | 47.848 | 52.1486 | 52.6791 | 439.3333 |
No log | 32.3077 | 420 | 0.2282 | 35.307 | 20.7981 | 25.3321 | 25.7283 | 648.3333 |
No log | 33.0769 | 430 | 0.2268 | 39.9343 | 26.2938 | 32.5539 | 32.5389 | 464.8333 |
No log | 33.8462 | 440 | 0.2160 | 37.5551 | 29.1716 | 36.4583 | 36.3205 | 23.0 |
No log | 34.6154 | 450 | 0.2049 | 43.1026 | 33.2667 | 40.7167 | 40.7024 | 108.0 |
No log | 35.3846 | 460 | 0.2006 | 61.876 | 50.0227 | 53.1594 | 53.2425 | 502.6667 |
No log | 36.1538 | 470 | 0.1934 | 60.7038 | 50.0727 | 55.2509 | 54.8126 | 338.8333 |
No log | 36.9231 | 480 | 0.1960 | 70.3567 | 56.2927 | 61.7649 | 62.2948 | 358.6667 |
No log | 37.6923 | 490 | 0.1792 | 59.3192 | 42.9024 | 47.1844 | 47.5165 | 355.5 |
0.5531 | 38.4615 | 500 | 0.1755 | 58.8161 | 44.5037 | 47.7178 | 47.6386 | 501.5 |
0.5531 | 39.2308 | 510 | 0.1892 | 54.0773 | 43.7896 | 47.246 | 47.0727 | 440.1667 |
0.5531 | 40.0 | 520 | 0.1821 | 57.2344 | 46.5657 | 52.5641 | 52.5542 | 589.1667 |
0.5531 | 40.7692 | 530 | 0.1729 | 68.5089 | 53.586 | 60.131 | 60.3304 | 292.6667 |
0.5531 | 41.5385 | 540 | 0.1989 | 63.9246 | 51.624 | 55.4652 | 55.8813 | 355.3333 |
0.5531 | 42.3077 | 550 | 0.1868 | 60.7441 | 50.1997 | 55.0352 | 53.7644 | 564.3333 |
0.5531 | 43.0769 | 560 | 0.1570 | 44.0831 | 33.923 | 37.6398 | 37.451 | 748.6667 |
0.5531 | 43.8462 | 570 | 0.1806 | 60.5725 | 47.5269 | 52.2245 | 53.3507 | 487.8333 |
0.5531 | 44.6154 | 580 | 0.1984 | 64.7623 | 56.5668 | 58.7952 | 59.3482 | 527.1667 |
0.5531 | 45.3846 | 590 | 0.1673 | 62.8231 | 50.6443 | 53.4276 | 53.4813 | 385.8333 |
0.5531 | 46.1538 | 600 | 0.1593 | 77.1493 | 70.2538 | 73.9133 | 74.0634 | 336.5 |
0.5531 | 46.9231 | 610 | 0.1787 | 69.6579 | 57.144 | 62.8631 | 63.1825 | 264.1667 |
0.5531 | 47.6923 | 620 | 0.1579 | 67.3991 | 55.4929 | 60.496 | 59.9907 | 237.5 |
0.5531 | 48.4615 | 630 | 0.1510 | 55.7614 | 52.4735 | 54.2066 | 54.4553 | 351.3333 |
0.5531 | 49.2308 | 640 | 0.1490 | 66.8343 | 59.1175 | 62.6098 | 62.6185 | 489.1667 |
0.5531 | 50.0 | 650 | 0.1450 | 73.7447 | 68.8381 | 72.2138 | 71.7347 | 403.1667 |
0.5531 | 50.7692 | 660 | 0.1435 | 73.4612 | 62.1625 | 67.6424 | 67.8374 | 335.0 |
0.5531 | 51.5385 | 670 | 0.1412 | 69.9245 | 63.2467 | 67.5193 | 66.7139 | 459.3333 |
0.5531 | 52.3077 | 680 | 0.1537 | 67.309 | 56.0056 | 60.5465 | 60.7674 | 483.3333 |
0.5531 | 53.0769 | 690 | 0.1618 | 66.0585 | 54.5418 | 60.2616 | 59.8329 | 391.1667 |
0.5531 | 53.8462 | 700 | 0.1546 | 62.9813 | 57.9394 | 61.4801 | 60.8618 | 532.5 |
0.5531 | 54.6154 | 710 | 0.1768 | 69.2968 | 62.2167 | 65.5068 | 65.6779 | 463.5 |
0.5531 | 55.3846 | 720 | 0.1523 | 70.6019 | 64.4629 | 68.7182 | 68.6705 | 468.3333 |
0.5531 | 56.1538 | 730 | 0.1452 | 74.6336 | 70.8117 | 73.3083 | 73.5846 | 427.5 |
0.5531 | 56.9231 | 740 | 0.1458 | 80.2581 | 73.4241 | 77.8048 | 78.2945 | 321.5 |
0.5531 | 57.6923 | 750 | 0.1454 | 69.5709 | 60.7631 | 64.0057 | 64.1665 | 438.5 |
0.5531 | 58.4615 | 760 | 0.1440 | 74.8974 | 70.6795 | 73.4561 | 73.6899 | 415.6667 |
0.5531 | 59.2308 | 770 | 0.1420 | 75.8343 | 70.7545 | 74.2487 | 74.3303 | 370.8333 |
0.5531 | 60.0 | 780 | 0.1518 | 68.975 | 60.6509 | 63.3542 | 63.4528 | 488.0 |
0.5531 | 60.7692 | 790 | 0.1329 | 75.4609 | 65.9764 | 70.407 | 70.9722 | 379.6667 |
0.5531 | 61.5385 | 800 | 0.1298 | 75.6475 | 67.6634 | 72.3407 | 72.6996 | 405.3333 |
0.5531 | 62.3077 | 810 | 0.1324 | 76.1183 | 68.3992 | 73.0096 | 73.3558 | 379.3333 |
0.5531 | 63.0769 | 820 | 0.1469 | 61.1852 | 57.2433 | 60.7155 | 60.4608 | 675.1667 |
0.5531 | 63.8462 | 830 | 0.1385 | 68.2356 | 60.6576 | 63.8079 | 63.9332 | 513.1667 |
0.5531 | 64.6154 | 840 | 0.1434 | 71.3804 | 66.5798 | 69.5366 | 69.5204 | 508.0 |
0.5531 | 65.3846 | 850 | 0.1557 | 63.2252 | 59.4299 | 61.8559 | 61.89 | 537.0 |
0.5531 | 66.1538 | 860 | 0.1489 | 74.2213 | 68.7578 | 72.1378 | 72.1929 | 472.1667 |
0.5531 | 66.9231 | 870 | 0.1582 | 79.3572 | 72.5039 | 77.4724 | 77.8716 | 324.5 |
0.5531 | 67.6923 | 880 | 0.1419 | 70.4109 | 65.0778 | 68.5519 | 68.6548 | 523.0 |
0.5531 | 68.4615 | 890 | 0.1403 | 75.0692 | 67.5111 | 72.954 | 73.2228 | 379.3333 |
0.5531 | 69.2308 | 900 | 0.1411 | 74.8948 | 66.439 | 72.7139 | 73.0614 | 383.0 |
0.5531 | 70.0 | 910 | 0.1423 | 79.3572 | 71.8921 | 77.4724 | 77.8716 | 325.5 |
0.5531 | 70.7692 | 920 | 0.1398 | 79.3572 | 72.135 | 77.4724 | 77.8716 | 325.5 |
0.5531 | 71.5385 | 930 | 0.1376 | 75.2809 | 70.7071 | 73.6409 | 73.8805 | 410.0 |
0.5531 | 72.3077 | 940 | 0.1440 | 75.7518 | 70.6157 | 74.2567 | 74.4963 | 381.0 |
0.5531 | 73.0769 | 950 | 0.1434 | 80.9338 | 73.4733 | 78.7226 | 79.3074 | 319.1667 |
0.5531 | 73.8462 | 960 | 0.1403 | 80.33 | 73.1042 | 78.1987 | 78.7715 | 321.0 |
0.5531 | 74.6154 | 970 | 0.1393 | 75.7518 | 70.7071 | 74.2151 | 74.4547 | 377.6667 |
0.5531 | 75.3846 | 980 | 0.1363 | 75.2169 | 70.6795 | 73.6694 | 73.9091 | 414.1667 |
0.5531 | 76.1538 | 990 | 0.1392 | 75.7518 | 70.7639 | 74.5831 | 74.8227 | 371.6667 |
0.0743 | 76.9231 | 1000 | 0.1457 | 75.8091 | 71.008 | 74.7065 | 74.9461 | 369.5 |
0.0743 | 77.6923 | 1010 | 0.1476 | 75.6793 | 70.7662 | 74.2724 | 74.512 | 389.0 |
0.0743 | 78.4615 | 1020 | 0.1504 | 74.9721 | 70.6949 | 73.5623 | 73.6876 | 419.8333 |
0.0743 | 79.2308 | 1030 | 0.1488 | 74.9721 | 70.6949 | 73.5623 | 73.6876 | 419.8333 |
0.0743 | 80.0 | 1040 | 0.1457 | 67.2012 | 63.9833 | 66.413 | 66.8448 | 518.6667 |
0.0743 | 80.7692 | 1050 | 0.1411 | 75.0783 | 70.1206 | 73.56 | 73.7876 | 416.8333 |
0.0743 | 81.5385 | 1060 | 0.1444 | 74.9181 | 70.6595 | 73.5353 | 73.7381 | 430.0 |
0.0743 | 82.3077 | 1070 | 0.1661 | 75.252 | 70.7071 | 73.6151 | 73.8548 | 412.0 |
0.0743 | 83.0769 | 1080 | 0.1686 | 75.7518 | 71.0652 | 74.5395 | 74.7791 | 367.6667 |
0.0743 | 83.8462 | 1090 | 0.1691 | 75.0598 | 70.7071 | 73.5513 | 73.7701 | 417.1667 |
0.0743 | 84.6154 | 1100 | 0.1678 | 74.9666 | 70.6637 | 73.4386 | 73.6548 | 423.0 |
0.0743 | 85.3846 | 1110 | 0.1671 | 74.7224 | 70.4484 | 73.3686 | 73.5149 | 453.0 |
0.0743 | 86.1538 | 1120 | 0.1656 | 75.7518 | 70.7717 | 74.3526 | 74.5922 | 378.3333 |
0.0743 | 86.9231 | 1130 | 0.1643 | 75.7518 | 70.7717 | 74.3526 | 74.5922 | 378.5 |
0.0743 | 87.6923 | 1140 | 0.1596 | 75.7518 | 70.7717 | 74.3526 | 74.5922 | 378.5 |
0.0743 | 88.4615 | 1150 | 0.1592 | 75.2818 | 70.7071 | 73.7514 | 73.991 | 403.1667 |
0.0743 | 89.2308 | 1160 | 0.1607 | 75.2883 | 70.7071 | 73.6474 | 73.887 | 410.3333 |
0.0743 | 90.0 | 1170 | 0.1600 | 75.0598 | 70.7071 | 73.5513 | 73.7701 | 417.1667 |
0.0743 | 90.7692 | 1180 | 0.1571 | 75.3879 | 70.7071 | 73.981 | 74.2206 | 397.0 |
0.0743 | 91.5385 | 1190 | 0.1561 | 75.3966 | 70.7071 | 73.9896 | 74.2292 | 396.8333 |
0.0743 | 92.3077 | 1200 | 0.1556 | 75.3794 | 70.7071 | 73.9724 | 74.2121 | 398.3333 |
0.0743 | 93.0769 | 1210 | 0.1555 | 75.4228 | 70.7071 | 74.0159 | 74.2555 | 396.1667 |
0.0743 | 93.8462 | 1220 | 0.1556 | 75.4228 | 70.7071 | 74.0159 | 74.2555 | 396.1667 |
0.0743 | 94.6154 | 1230 | 0.1557 | 75.4228 | 70.7071 | 74.0159 | 74.2555 | 396.1667 |
0.0743 | 95.3846 | 1240 | 0.1558 | 75.4228 | 70.7071 | 74.0159 | 74.2555 | 396.1667 |
0.0743 | 96.1538 | 1250 | 0.1559 | 75.4228 | 70.7071 | 74.0159 | 74.2555 | 396.1667 |
0.0743 | 96.9231 | 1260 | 0.1559 | 75.4228 | 70.7071 | 74.0159 | 74.2555 | 396.1667 |
0.0743 | 97.6923 | 1270 | 0.1559 | 75.4228 | 70.7071 | 74.0159 | 74.2555 | 396.1667 |
0.0743 | 98.4615 | 1280 | 0.1560 | 75.4228 | 70.7071 | 74.0159 | 74.2555 | 396.1667 |
0.0743 | 99.2308 | 1290 | 0.1560 | 75.7518 | 70.9213 | 74.5831 | 74.8227 | 370.3333 |
0.0743 | 100.0 | 1300 | 0.1560 | 75.4228 | 70.7071 | 74.0159 | 74.2555 | 396.1667 |
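The headline numbers at the top of this card match the last row of the table (step 1300), which is not the best row by either validation loss or Rouge1. The sketch below shows one way to read the table programmatically: it parses pipe-separated rows in the format above and picks out the row with the lowest validation loss and the row with the highest Rouge1. Only four rows are copied in for brevity; `parse_row` and the field names are illustrative and not part of any library.

```python
# A few rows copied verbatim from the training results table above.
ROWS = """\
No log | 0.7692 | 10 | 1.7251 | 13.6123 | 5.7258 | 13.1801 | 13.1787 | 1027.0 |
0.5531 | 61.5385 | 800 | 0.1298 | 75.6475 | 67.6634 | 72.3407 | 72.6996 | 405.3333 |
0.5531 | 73.0769 | 950 | 0.1434 | 80.9338 | 73.4733 | 78.7226 | 79.3074 | 319.1667 |
0.0743 | 100.0 | 1300 | 0.1560 | 75.4228 | 70.7071 | 74.0159 | 74.2555 | 396.1667 |
"""

# Illustrative field names, in the same order as the table header.
FIELDS = ("train_loss", "epoch", "step", "val_loss",
          "rouge1", "rouge2", "rougeL", "rougeLsum", "gen_len")


def parse_row(line):
    """Turn one pipe-separated table row into a dict of typed values."""
    cells = [c.strip() for c in line.strip().strip("|").split("|")]
    row = dict(zip(FIELDS, cells))
    # "No log" means the training loss had not been logged yet.
    row["train_loss"] = (None if row["train_loss"] == "No log"
                         else float(row["train_loss"]))
    row["step"] = int(row["step"])
    for key in FIELDS[1:2] + FIELDS[3:]:
        row[key] = float(row[key])
    return row


rows = [parse_row(line) for line in ROWS.splitlines() if line.strip()]
best_loss = min(rows, key=lambda r: r["val_loss"])
best_rouge1 = max(rows, key=lambda r: r["rouge1"])
final = rows[-1]

# The epoch column advances by 0.7692 every 10 steps, i.e. about 13 steps per epoch.
steps_per_epoch = round(rows[0]["step"] / rows[0]["epoch"])

if __name__ == "__main__":
    print("lowest val loss:", best_loss["step"], best_loss["val_loss"])
    print("highest Rouge1: ", best_rouge1["step"], best_rouge1["rouge1"])
    print("final row:      ", final["step"], final["val_loss"], final["rouge1"])
    print("steps per epoch:", steps_per_epoch)
```

Across the full table, step 800 has the lowest validation loss (0.1298) and step 950 the highest Rouge1 (80.9338), while the final row sits at 0.1560 and 75.4228. Whether any intermediate checkpoint was saved is not stated in this card. At roughly 13 steps per epoch with a batch size of 8, the training set would hold about 97 to 104 examples, assuming a single device and no gradient accumulation.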
Framework versions
- Transformers 4.44.2
- Pytorch 2.5.0+cu121
- Datasets 3.0.2
- Tokenizers 0.19.1
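To check a local environment against the versions listed above, a minimal sketch using only the standard library follows. The distribution names (`transformers`, `torch`, `datasets`, `tokenizers`) are the usual PyPI names and are an assumption here, and `check_versions` is a hypothetical helper. Matching versions exactly is not necessarily required to load the model.

```python
from importlib import metadata

# Versions listed in this card; keys are assumed PyPI distribution names.
EXPECTED = {
    "transformers": "4.44.2",
    "torch": "2.5.0+cu121",
    "datasets": "3.0.2",
    "tokenizers": "0.19.1",
}


def check_versions(expected=EXPECTED):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, want in expected.items():
        try:
            have = metadata.version(name)
        except metadata.PackageNotFoundError:
            have = None  # package is not installed
        if have != want:
            mismatches[name] = (want, have)
    return mismatches


if __name__ == "__main__":
    for name, (want, have) in check_versions().items():
        print(f"{name}: card lists {want}, found {have}")
```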
Model tree for imhereforthememes/t5-small-fine-tuned_model_3
Base model
google-t5/t5-small