
arxiv-summarization-t5-base-2022-09-21

This model is a fine-tuned version of t5-base on the ccdv/arxiv-summarization dataset. It achieves the following results on the evaluation set:

  • Loss: 1.8650
  • Rouge1: 40.6781
  • Rouge2: 14.7167
  • Rougel: 26.6375
  • Rougelsum: 35.5959
  • Gen Len: 117.1969
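
The card does not say which ROUGE implementation produced these scores, so the following is only a simplified sketch of the idea behind Rouge1 and Rouge2: an F1 score over overlapping n-grams, reported here on a 0-100 scale to match the figures above. The function names and example sentences are illustrative assumptions. Standard ROUGE tooling typically adds its own tokenization and optional stemming, so its numbers will differ from this toy version.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(reference, candidate, n=1):
    """Simplified ROUGE-N F1 on a 0-100 scale (lowercased whitespace tokens)."""
    ref = ngrams(reference.lower().split(), n)
    cand = ngrams(candidate.lower().split(), n)
    overlap = sum((ref & cand).values())  # clipped n-gram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 100 * 2 * precision * recall / (precision + recall)


# Hypothetical sentences, not taken from the dataset.
reference = "we propose a new method for abstractive summarization"
candidate = "we propose a method for summarization"
print(round(rouge_n(reference, candidate, n=1), 2))  # 85.71
print(round(rouge_n(reference, candidate, n=2), 2))  # 50.0
```

Rouge2 is lower than Rouge1 in the example for the same reason it is lower in the results above: matching consecutive word pairs is a stricter requirement than matching single words.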

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 3.0
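
As a rough illustration of what `lr_scheduler_type: linear` implies for the learning rate, the sketch below decays from the listed `learning_rate` of 5e-05 to zero over the run. Two things are assumptions rather than facts from this card: no warmup is used (none is listed), and the total step count is extrapolated from the last logged row of the training results (step 600000 at epoch 2.96), which is itself rounded to two decimals. The function name `linear_lr` is hypothetical.

```python
def linear_lr(step, total_steps, base_lr=5e-05, warmup_steps=0):
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Assumption: total steps estimated from the last logged training row
# (step 600000 at epoch 2.96), extrapolated to num_epochs = 3.0.
total_steps = round(600_000 / 2.96 * 3.0)

for step in (0, 300_000, 600_000):
    print(step, f"{linear_lr(step, total_steps):.3e}")
```

With `train_batch_size: 1`, each optimizer step sees a single example (assuming no gradient accumulation, which the card does not mention), so the estimated steps per epoch also approximates the number of training examples.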

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|---|---|---|---|---|---|---|---|---|
| 2.3291 | 0.05 | 10000 | 2.1906 | 18.6571 | 7.1341 | 14.8347 | 16.9545 | 19.0 |
| 2.2454 | 0.1 | 20000 | 2.1549 | 18.5037 | 7.1908 | 14.7141 | 16.8233 | 18.9997 |
| 2.2107 | 0.15 | 30000 | 2.1013 | 18.7638 | 7.326 | 14.9437 | 17.072 | 19.0 |
| 2.1486 | 0.2 | 40000 | 2.0845 | 18.6879 | 7.2441 | 14.8835 | 16.983 | 19.0 |
| 2.158 | 0.25 | 50000 | 2.0699 | 18.8314 | 7.3712 | 15.0166 | 17.1215 | 19.0 |
| 2.1476 | 0.3 | 60000 | 2.0424 | 18.9783 | 7.4138 | 15.1121 | 17.2778 | 18.9981 |
| 2.1164 | 0.34 | 70000 | 2.0349 | 18.9257 | 7.4649 | 15.0335 | 17.1819 | 19.0 |
| 2.079 | 0.39 | 80000 | 2.0208 | 18.643 | 7.4096 | 14.8927 | 16.9786 | 18.9994 |
| 2.101 | 0.44 | 90000 | 2.0113 | 19.3881 | 7.7012 | 15.3981 | 17.6516 | 19.0 |
| 2.0576 | 0.49 | 100000 | 2.0022 | 18.9985 | 7.542 | 15.1157 | 17.2972 | 18.9992 |
| 2.0983 | 0.54 | 110000 | 1.9941 | 18.7691 | 7.4625 | 15.0256 | 17.1146 | 19.0 |
| 2.053 | 0.59 | 120000 | 1.9855 | 19.002 | 7.5602 | 15.1497 | 17.2963 | 19.0 |
| 2.0434 | 0.64 | 130000 | 1.9786 | 19.2385 | 7.6533 | 15.3094 | 17.5439 | 18.9994 |
| 2.0354 | 0.69 | 140000 | 1.9746 | 19.184 | 7.7307 | 15.2897 | 17.491 | 18.9992 |
| 2.0347 | 0.74 | 150000 | 1.9639 | 19.2408 | 7.693 | 15.3357 | 17.5297 | 19.0 |
| 2.0236 | 0.79 | 160000 | 1.9590 | 19.0781 | 7.6256 | 15.1932 | 17.3486 | 18.9998 |
| 2.0187 | 0.84 | 170000 | 1.9532 | 19.0343 | 7.6792 | 15.1884 | 17.3519 | 19.0 |
| 1.9939 | 0.89 | 180000 | 1.9485 | 18.8247 | 7.5005 | 15.0246 | 17.1485 | 18.9998 |
| 1.9961 | 0.94 | 190000 | 1.9504 | 19.0695 | 7.6559 | 15.2139 | 17.3814 | 19.0 |
| 2.0197 | 0.99 | 200000 | 1.9399 | 19.2821 | 7.6685 | 15.3029 | 17.5374 | 18.9988 |
| 1.9457 | 1.03 | 210000 | 1.9350 | 19.053 | 7.6502 | 15.2123 | 17.3793 | 19.0 |
| 1.9552 | 1.08 | 220000 | 1.9317 | 19.1878 | 7.7235 | 15.3272 | 17.5252 | 18.9998 |
| 1.9772 | 1.13 | 230000 | 1.9305 | 19.0855 | 7.6303 | 15.1943 | 17.3942 | 18.9997 |
| 1.9171 | 1.18 | 240000 | 1.9291 | 19.0711 | 7.6437 | 15.2175 | 17.3893 | 18.9995 |
| 1.9393 | 1.23 | 250000 | 1.9230 | 19.276 | 7.725 | 15.3826 | 17.586 | 18.9995 |
| 1.9295 | 1.28 | 260000 | 1.9197 | 19.2999 | 7.7958 | 15.3961 | 17.6056 | 18.9975 |
| 1.9725 | 1.33 | 270000 | 1.9173 | 19.2958 | 7.7121 | 15.3659 | 17.584 | 19.0 |
| 1.9668 | 1.38 | 280000 | 1.9129 | 19.089 | 7.6846 | 15.2395 | 17.3879 | 18.9998 |
| 1.941 | 1.43 | 290000 | 1.9132 | 19.2127 | 7.7336 | 15.311 | 17.4742 | 18.9995 |
| 1.9427 | 1.48 | 300000 | 1.9108 | 19.217 | 7.7591 | 15.334 | 17.53 | 18.9998 |
| 1.9521 | 1.53 | 310000 | 1.9041 | 19.1285 | 7.6736 | 15.2625 | 17.458 | 19.0 |
| 1.9352 | 1.58 | 320000 | 1.9041 | 19.1656 | 7.723 | 15.3035 | 17.4818 | 18.9991 |
| 1.9342 | 1.63 | 330000 | 1.9004 | 19.2573 | 7.7766 | 15.3558 | 17.5382 | 19.0 |
| 1.9631 | 1.68 | 340000 | 1.8978 | 19.236 | 7.7584 | 15.3408 | 17.4993 | 18.9998 |
| 1.8987 | 1.72 | 350000 | 1.8968 | 19.1716 | 7.7231 | 15.2836 | 17.4655 | 18.9997 |
| 1.9433 | 1.77 | 360000 | 1.8924 | 19.2644 | 7.8294 | 15.4018 | 17.5808 | 18.9998 |
| 1.9159 | 1.82 | 370000 | 1.8912 | 19.1833 | 7.8267 | 15.3175 | 17.4918 | 18.9995 |
| 1.9516 | 1.87 | 380000 | 1.8856 | 19.3077 | 7.7432 | 15.3723 | 17.6115 | 19.0 |
| 1.9218 | 1.92 | 390000 | 1.8880 | 19.2668 | 7.8231 | 15.3834 | 17.5701 | 18.9994 |
| 1.9159 | 1.97 | 400000 | 1.8860 | 19.2224 | 7.7903 | 15.3488 | 17.4992 | 18.9997 |
| 1.8741 | 2.02 | 410000 | 1.8854 | 19.2572 | 7.741 | 15.3405 | 17.5351 | 19.0 |
| 1.8668 | 2.07 | 420000 | 1.8854 | 19.3658 | 7.8593 | 15.4418 | 17.656 | 18.9995 |
| 1.8638 | 2.12 | 430000 | 1.8831 | 19.305 | 7.8218 | 15.3843 | 17.5861 | 18.9997 |
| 1.8334 | 2.17 | 440000 | 1.8817 | 19.3269 | 7.8249 | 15.4231 | 17.5958 | 18.9994 |
| 1.8893 | 2.22 | 450000 | 1.8803 | 19.2949 | 7.7885 | 15.3947 | 17.585 | 18.9997 |
| 1.8929 | 2.27 | 460000 | 1.8783 | 19.291 | 7.8346 | 15.428 | 17.5797 | 18.9997 |
| 1.861 | 2.32 | 470000 | 1.8766 | 19.4284 | 7.8832 | 15.4746 | 17.6946 | 18.9997 |
| 1.8719 | 2.37 | 480000 | 1.8751 | 19.1525 | 7.7641 | 15.3348 | 17.47 | 18.9998 |
| 1.8889 | 2.41 | 490000 | 1.8742 | 19.1743 | 7.768 | 15.3292 | 17.4665 | 18.9998 |
| 1.8834 | 2.46 | 500000 | 1.8723 | 19.3069 | 7.7935 | 15.3987 | 17.5913 | 18.9998 |
| 1.8564 | 2.51 | 510000 | 1.8695 | 19.3217 | 7.8292 | 15.4063 | 17.6081 | 19.0 |
| 1.8706 | 2.56 | 520000 | 1.8697 | 19.294 | 7.8217 | 15.3964 | 17.581 | 18.9998 |
| 1.883 | 2.61 | 530000 | 1.8703 | 19.2784 | 7.8634 | 15.404 | 17.5942 | 18.9995 |
| 1.8622 | 2.66 | 540000 | 1.8677 | 19.3165 | 7.8378 | 15.4259 | 17.6064 | 18.9988 |
| 1.8781 | 2.71 | 550000 | 1.8676 | 19.3237 | 7.7954 | 15.3995 | 17.6008 | 19.0 |
| 1.8793 | 2.76 | 560000 | 1.8685 | 19.2141 | 7.7605 | 15.3345 | 17.5268 | 18.9997 |
| 1.8795 | 2.81 | 570000 | 1.8675 | 19.2694 | 7.8082 | 15.3996 | 17.5831 | 19.0 |
| 1.8425 | 2.86 | 580000 | 1.8659 | 19.2886 | 7.7987 | 15.4005 | 17.5859 | 18.9997 |
| 1.8605 | 2.91 | 590000 | 1.8650 | 19.2778 | 7.7934 | 15.3931 | 17.5809 | 18.9997 |
| 1.8448 | 2.96 | 600000 | 1.8655 | 19.2884 | 7.8087 | 15.4025 | 17.5856 | 19.0 |
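
The lowest validation loss in the table, 1.8650 at step 590000, coincides with the loss reported at the top of this card. Whether that checkpoint was actually the one selected is not stated, so the sketch below only shows how such a comparison could be made. It uses the last four rows of the table as an excerpt. The tuple layout and variable names are illustrative assumptions.

```python
# Excerpt: the last four rows of the results table,
# as (step, validation loss, Rouge1).
rows = [
    (570000, 1.8675, 19.2694),
    (580000, 1.8659, 19.2886),
    (590000, 1.8650, 19.2778),
    (600000, 1.8655, 19.2884),
]

best_by_loss = min(rows, key=lambda r: r[1])
best_by_rouge1 = max(rows, key=lambda r: r[2])

print("lowest validation loss:", best_by_loss)
print("highest Rouge1 in excerpt:", best_by_rouge1)
```

Within this excerpt the two criteria pick different steps (590000 by loss, 580000 by Rouge1), which is a reminder that the choice of selection metric can change which checkpoint counts as best.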

Framework versions

  • Transformers 4.23.0.dev0
  • PyTorch 1.12.0
  • Datasets 2.5.1
  • Tokenizers 0.13.0

Dataset used to train farleyknight/arxiv-summarization-t5-base-2022-09-21: ccdv/arxiv-summarization