
summarization_model_test_full

This model is a fine-tuned version of google/flan-t5-small on the billsum dataset. It achieves the following results on the evaluation set:

  • Loss: 1.7652
  • Rouge1: 19.6383
  • Rouge2: 11.2053
  • RougeL: 17.3949
  • RougeLsum: 18.5149
  • Gen Len: 19.0

Model description

More information needed

Intended uses & limitations

More information needed
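
Pending that information, a minimal usage sketch for summarization with the transformers pipeline follows; it assumes the checkpoint is published under the Hub id jnelen/summarization_model_test_full and is not an author-provided example.

```python
from transformers import pipeline

# Load the fine-tuned checkpoint from the Hugging Face Hub.
# The repository id is assumed from this model card, not from documented usage.
summarizer = pipeline("summarization", model="jnelen/summarization_model_test_full")

# Placeholder input; billsum bill texts are typically much longer.
bill_text = (
    "This Act may be cited as the Example Act. The Act amends the "
    "Internal Revenue Code to extend certain energy credits."
)

# Gen Len of 19.0 in the results above suggests short generations,
# so max_length is capped accordingly here.
print(summarizer(bill_text, max_length=19, min_length=5)[0]["summary_text"])
```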

Training and evaluation data

More information needed
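
Absent those details, a minimal sketch of loading billsum with the datasets library is shown below; the ca_test split and the 80/20 re-split mirror the Hugging Face summarization tutorial and are assumptions, not the author's recipe.

```python
from datasets import load_dataset

# Load billsum; the split choice follows the Hugging Face summarization
# tutorial and is an assumption -- the author's actual setup is undocumented.
billsum = load_dataset("billsum", split="ca_test")
billsum = billsum.train_test_split(test_size=0.2, seed=42)

example = billsum["train"][0]
print(example["text"][:200])  # bill text: the model input
print(example["summary"])     # reference summary: the target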

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them to Seq2SeqTrainingArguments follows this list):

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
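
As noted above, these values map onto transformers' Seq2SeqTrainingArguments roughly as in the sketch below; output_dir, the evaluation strategy, and predict_with_generate are assumptions, since the author's training script is not provided.

```python
from transformers import Seq2SeqTrainingArguments

# Reconstruction of the listed hyperparameters; output_dir, the per-epoch
# evaluation strategy, and predict_with_generate are assumptions.
training_args = Seq2SeqTrainingArguments(
    output_dir="summarization_model_test_full",
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    num_train_epochs=100,
    lr_scheduler_type="linear",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    evaluation_strategy="epoch",   # the table below logs one eval per epoch
    predict_with_generate=True,    # required to compute ROUGE on generations
)
```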

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | RougeL  | RougeLsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| No log        | 1.0   | 62   | 2.1744          | 20.1855 | 10.268  | 17.0388 | 18.7069   | 19.0    |
| No log        | 2.0   | 124  | 2.0830          | 19.9562 | 10.2364 | 17.0162 | 18.5535   | 19.0    |
| No log        | 3.0   | 186  | 2.0327          | 19.365  | 9.9247  | 16.5556 | 17.9205   | 19.0    |
| No log        | 4.0   | 248  | 1.9944          | 19.7059 | 10.1539 | 16.8672 | 18.2399   | 19.0    |
| No log        | 5.0   | 310  | 1.9659          | 20.0813 | 10.8566 | 17.2935 | 18.6275   | 19.0    |
| No log        | 6.0   | 372  | 1.9366          | 19.6773 | 10.4254 | 17.0455 | 18.3023   | 19.0    |
| No log        | 7.0   | 434  | 1.9221          | 19.6565 | 10.4774 | 17.1558 | 18.2997   | 19.0    |
| No log        | 8.0   | 496  | 1.8966          | 19.6239 | 10.3022 | 17.0537 | 18.3526   | 19.0    |
| 2.2025        | 9.0   | 558  | 1.8872          | 19.3585 | 10.3302 | 16.8669 | 18.1668   | 19.0    |
| 2.2025        | 10.0  | 620  | 1.8697          | 19.5805 | 10.3337 | 17.0132 | 18.2799   | 19.0    |
| 2.2025        | 11.0  | 682  | 1.8649          | 19.3848 | 10.3388 | 16.8786 | 18.1003   | 19.0    |
| 2.2025        | 12.0  | 744  | 1.8524          | 19.7519 | 10.6495 | 17.1712 | 18.4577   | 19.0    |
| 2.2025        | 13.0  | 806  | 1.8435          | 20.1432 | 11.1293 | 17.4232 | 18.8605   | 19.0    |
| 2.2025        | 14.0  | 868  | 1.8288          | 19.8406 | 10.5874 | 17.1041 | 18.4982   | 19.0    |
| 2.2025        | 15.0  | 930  | 1.8251          | 19.1028 | 10.2219 | 16.6665 | 17.9277   | 19.0    |
| 2.2025        | 16.0  | 992  | 1.8181          | 19.2449 | 10.3843 | 16.786  | 18.0513   | 19.0    |
| 1.8861        | 17.0  | 1054 | 1.8091          | 19.9139 | 10.8322 | 17.1391 | 18.5886   | 19.0    |
| 1.8861        | 18.0  | 1116 | 1.8064          | 19.7761 | 10.8167 | 17.0647 | 18.5176   | 19.0    |
| 1.8861        | 19.0  | 1178 | 1.7995          | 19.8554 | 11.0223 | 17.2002 | 18.6982   | 19.0    |
| 1.8861        | 20.0  | 1240 | 1.7930          | 19.5597 | 10.7289 | 17.011  | 18.3842   | 19.0    |
| 1.8861        | 21.0  | 1302 | 1.7888          | 19.1782 | 10.4075 | 16.6844 | 17.9089   | 19.0    |
| 1.8861        | 22.0  | 1364 | 1.7909          | 19.4924 | 10.6472 | 16.9382 | 18.2204   | 19.0    |
| 1.8861        | 23.0  | 1426 | 1.7891          | 19.4475 | 10.7497 | 16.9434 | 18.1978   | 19.0    |
| 1.8861        | 24.0  | 1488 | 1.7872          | 19.8736 | 11.184  | 17.3289 | 18.6547   | 19.0    |
| 1.7266        | 25.0  | 1550 | 1.7811          | 19.528  | 10.8734 | 17.0035 | 18.2733   | 19.0    |
| 1.7266        | 26.0  | 1612 | 1.7740          | 19.8775 | 10.9392 | 17.3007 | 18.6535   | 19.0    |
| 1.7266        | 27.0  | 1674 | 1.7719          | 19.5385 | 10.7868 | 17.0496 | 18.294    | 19.0    |
| 1.7266        | 28.0  | 1736 | 1.7608          | 19.3455 | 10.605  | 16.9156 | 18.1785   | 19.0    |
| 1.7266        | 29.0  | 1798 | 1.7704          | 19.5603 | 10.8755 | 17.1458 | 18.3165   | 19.0    |
| 1.7266        | 30.0  | 1860 | 1.7670          | 19.5976 | 10.767  | 17.1435 | 18.4264   | 19.0    |
| 1.7266        | 31.0  | 1922 | 1.7632          | 20.0315 | 11.1991 | 17.4017 | 18.878    | 19.0    |
| 1.7266        | 32.0  | 1984 | 1.7592          | 19.2901 | 10.3776 | 16.886  | 18.0728   | 19.0    |
| 1.612         | 33.0  | 2046 | 1.7608          | 19.9345 | 11.2158 | 17.5101 | 18.7281   | 19.0    |
| 1.612         | 34.0  | 2108 | 1.7661          | 19.8895 | 11.1244 | 17.3604 | 18.6366   | 19.0    |
| 1.612         | 35.0  | 2170 | 1.7573          | 19.527  | 10.7979 | 17.2852 | 18.3765   | 19.0    |
| 1.612         | 36.0  | 2232 | 1.7611          | 19.825  | 11.1296 | 17.4667 | 18.705    | 19.0    |
| 1.612         | 37.0  | 2294 | 1.7608          | 19.6718 | 10.9866 | 17.1989 | 18.4438   | 19.0    |
| 1.612         | 38.0  | 2356 | 1.7574          | 19.8291 | 11.1143 | 17.2426 | 18.5842   | 19.0    |
| 1.612         | 39.0  | 2418 | 1.7592          | 19.7818 | 11.3154 | 17.3337 | 18.5758   | 19.0    |
| 1.612         | 40.0  | 2480 | 1.7504          | 19.8648 | 11.1593 | 17.3199 | 18.6069   | 19.0    |
| 1.5209        | 41.0  | 2542 | 1.7585          | 19.8796 | 11.2009 | 17.3867 | 18.6824   | 19.0    |
| 1.5209        | 42.0  | 2604 | 1.7586          | 19.5433 | 10.8156 | 17.0882 | 18.2927   | 19.0    |
| 1.5209        | 43.0  | 2666 | 1.7570          | 19.7238 | 11.2383 | 17.3478 | 18.5807   | 19.0    |
| 1.5209        | 44.0  | 2728 | 1.7501          | 19.4512 | 10.7682 | 17.2254 | 18.3042   | 19.0    |
| 1.5209        | 45.0  | 2790 | 1.7501          | 19.7574 | 11.1604 | 17.3709 | 18.5352   | 19.0    |
| 1.5209        | 46.0  | 2852 | 1.7507          | 19.6208 | 11.0567 | 17.3059 | 18.4639   | 19.0    |
| 1.5209        | 47.0  | 2914 | 1.7529          | 19.5944 | 10.907  | 17.2455 | 18.4234   | 19.0    |
| 1.5209        | 48.0  | 2976 | 1.7470          | 20.0562 | 11.4073 | 17.5844 | 18.9184   | 19.0    |
| 1.4532        | 49.0  | 3038 | 1.7594          | 19.7614 | 11.21   | 17.4339 | 18.6328   | 19.0    |
| 1.4532        | 50.0  | 3100 | 1.7564          | 19.8331 | 11.2841 | 17.4349 | 18.673    | 19.0    |
| 1.4532        | 51.0  | 3162 | 1.7554          | 19.8524 | 11.1447 | 17.3783 | 18.6541   | 19.0    |
| 1.4532        | 52.0  | 3224 | 1.7528          | 19.7425 | 11.0923 | 17.3309 | 18.5151   | 19.0    |
| 1.4532        | 53.0  | 3286 | 1.7613          | 19.9237 | 11.3678 | 17.5919 | 18.7275   | 19.0    |
| 1.4532        | 54.0  | 3348 | 1.7490          | 19.6336 | 10.9842 | 17.3478 | 18.5493   | 19.0    |
| 1.4532        | 55.0  | 3410 | 1.7544          | 19.8248 | 11.2674 | 17.4681 | 18.6744   | 19.0    |
| 1.4532        | 56.0  | 3472 | 1.7533          | 19.9599 | 11.3907 | 17.5344 | 18.7955   | 19.0    |
| 1.3951        | 57.0  | 3534 | 1.7581          | 19.8866 | 11.2337 | 17.508  | 18.7827   | 19.0    |
| 1.3951        | 58.0  | 3596 | 1.7536          | 19.6304 | 10.9662 | 17.2659 | 18.4986   | 19.0    |
| 1.3951        | 59.0  | 3658 | 1.7564          | 19.7786 | 11.2141 | 17.4376 | 18.6144   | 19.0    |
| 1.3951        | 60.0  | 3720 | 1.7530          | 19.7982 | 11.2066 | 17.3471 | 18.5927   | 19.0    |
| 1.3951        | 61.0  | 3782 | 1.7582          | 19.8927 | 11.3067 | 17.5022 | 18.707    | 19.0    |
| 1.3951        | 62.0  | 3844 | 1.7533          | 19.5306 | 10.7525 | 17.1783 | 18.3809   | 19.0    |
| 1.3951        | 63.0  | 3906 | 1.7579          | 19.7105 | 11.1598 | 17.3115 | 18.5334   | 19.0    |
| 1.3951        | 64.0  | 3968 | 1.7562          | 19.8355 | 11.3164 | 17.4152 | 18.6765   | 19.0    |
| 1.3517        | 65.0  | 4030 | 1.7549          | 19.7557 | 11.191  | 17.3871 | 18.6421   | 19.0    |
| 1.3517        | 66.0  | 4092 | 1.7597          | 19.8852 | 11.2811 | 17.4705 | 18.7211   | 19.0    |
| 1.3517        | 67.0  | 4154 | 1.7602          | 19.6477 | 11.0227 | 17.2974 | 18.5146   | 19.0    |
| 1.3517        | 68.0  | 4216 | 1.7606          | 19.6709 | 11.0783 | 17.3564 | 18.4983   | 19.0    |
| 1.3517        | 69.0  | 4278 | 1.7548          | 19.7667 | 11.0008 | 17.3737 | 18.5458   | 19.0    |
| 1.3517        | 70.0  | 4340 | 1.7580          | 19.8392 | 11.1556 | 17.4514 | 18.678    | 19.0    |
| 1.3517        | 71.0  | 4402 | 1.7601          | 19.7668 | 11.2518 | 17.4695 | 18.6242   | 19.0    |
| 1.3517        | 72.0  | 4464 | 1.7576          | 19.7156 | 11.2389 | 17.3549 | 18.5532   | 19.0    |
| 1.3221        | 73.0  | 4526 | 1.7598          | 19.6953 | 11.2072 | 17.3965 | 18.579    | 19.0    |
| 1.3221        | 74.0  | 4588 | 1.7600          | 19.7549 | 11.3229 | 17.4771 | 18.6686   | 19.0    |
| 1.3221        | 75.0  | 4650 | 1.7602          | 19.7374 | 11.2304 | 17.3936 | 18.628    | 19.0    |
| 1.3221        | 76.0  | 4712 | 1.7625          | 19.6828 | 11.2713 | 17.4368 | 18.6089   | 19.0    |
| 1.3221        | 77.0  | 4774 | 1.7572          | 19.7871 | 11.2884 | 17.4626 | 18.6822   | 19.0    |
| 1.3221        | 78.0  | 4836 | 1.7582          | 19.7716 | 11.3186 | 17.5276 | 18.6968   | 19.0    |
| 1.3221        | 79.0  | 4898 | 1.7622          | 19.8097 | 11.339  | 17.5288 | 18.7231   | 19.0    |
| 1.3221        | 80.0  | 4960 | 1.7622          | 19.6995 | 11.114  | 17.4771 | 18.6018   | 19.0    |
| 1.2961        | 81.0  | 5022 | 1.7636          | 19.769  | 11.2326 | 17.513  | 18.6577   | 19.0    |
| 1.2961        | 82.0  | 5084 | 1.7568          | 19.7692 | 11.2903 | 17.4994 | 18.6537   | 19.0    |
| 1.2961        | 83.0  | 5146 | 1.7650          | 19.7302 | 11.307  | 17.468  | 18.6289   | 19.0    |
| 1.2961        | 84.0  | 5208 | 1.7643          | 19.6686 | 11.2042 | 17.4537 | 18.5437   | 19.0    |
| 1.2961        | 85.0  | 5270 | 1.7640          | 19.7238 | 11.2806 | 17.4493 | 18.5998   | 19.0    |
| 1.2961        | 86.0  | 5332 | 1.7631          | 19.7003 | 11.1788 | 17.4315 | 18.5896   | 19.0    |
| 1.2961        | 87.0  | 5394 | 1.7641          | 19.8238 | 11.3948 | 17.5118 | 18.6782   | 19.0    |
| 1.2961        | 88.0  | 5456 | 1.7654          | 19.6419 | 11.1966 | 17.4058 | 18.5255   | 19.0    |
| 1.274         | 89.0  | 5518 | 1.7651          | 19.5904 | 11.2484 | 17.4191 | 18.5085   | 19.0    |
| 1.274         | 90.0  | 5580 | 1.7652          | 19.5491 | 11.1972 | 17.3626 | 18.4374   | 19.0    |
| 1.274         | 91.0  | 5642 | 1.7617          | 19.4972 | 11.0731 | 17.2711 | 18.3751   | 19.0    |
| 1.274         | 92.0  | 5704 | 1.7632          | 19.5798 | 11.1521 | 17.3303 | 18.4391   | 19.0    |
| 1.274         | 93.0  | 5766 | 1.7636          | 19.5843 | 11.1499 | 17.3484 | 18.4646   | 19.0    |
| 1.274         | 94.0  | 5828 | 1.7636          | 19.668  | 11.2353 | 17.4066 | 18.5567   | 19.0    |
| 1.274         | 95.0  | 5890 | 1.7640          | 19.6222 | 11.1724 | 17.3758 | 18.5105   | 19.0    |
| 1.274         | 96.0  | 5952 | 1.7646          | 19.6386 | 11.1999 | 17.3887 | 18.5139   | 19.0    |
| 1.2641        | 97.0  | 6014 | 1.7653          | 19.6783 | 11.2232 | 17.4207 | 18.5636   | 19.0    |
| 1.2641        | 98.0  | 6076 | 1.7651          | 19.696  | 11.282  | 17.4319 | 18.5786   | 19.0    |
| 1.2641        | 99.0  | 6138 | 1.7654          | 19.6377 | 11.1911 | 17.3946 | 18.5137   | 19.0    |
| 1.2641        | 100.0 | 6200 | 1.7652          | 19.6383 | 11.2053 | 17.3949 | 18.5149   | 19.0    |
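
For reference, ROUGE scores in the form reported above (F1 values scaled to 0-100) can be computed with the evaluate library, as in the minimal sketch below; the prediction/reference pair is a placeholder, not billsum data.

```python
import evaluate

rouge = evaluate.load("rouge")

# Placeholder texts for illustration only.
predictions = ["the bill extends the renewable electricity production credit"]
references = ["this bill extends the production tax credit for renewable electricity"]

scores = rouge.compute(predictions=predictions, references=references)
# Keys match the columns above: rouge1, rouge2, rougeL, rougeLsum.
print({k: round(v * 100, 4) for k, v in scores.items()})
```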

Framework versions

  • Transformers 4.28.1
  • PyTorch 2.0.0+cu118
  • Datasets 2.11.0
  • Tokenizers 0.13.3