
my_awesome_billsum_model_26

This model is a fine-tuned version of google-t5/t5-small on an unspecified dataset (the dataset name was not filled in when this card was generated). It achieves the following results on the evaluation set (a brief usage sketch follows the results):

  • Loss: 0.2944
  • ROUGE-1: 0.9821
  • ROUGE-2: 0.9347
  • ROUGE-L: 0.9494
  • ROUGE-Lsum: 0.9511
  • Gen Len (average generated length, in tokens): 5.2708
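
Because the base checkpoint is t5-small, the fine-tuned model can be exercised with the standard transformers summarization pipeline. The snippet below is a minimal sketch rather than the authors' own inference code: the model path `my_awesome_billsum_model_26` is assumed to be a local output directory or Hub repository id, and the `"summarize: "` prefix is only needed if the training data was preprocessed with it (as in the usual T5 recipes).

```python
from transformers import pipeline

# Load the fine-tuned checkpoint; the path/repo id below is an assumption, adjust as needed.
summarizer = pipeline("summarization", model="my_awesome_billsum_model_26")

# The "summarize: " prefix is assumed to match the training preprocessing.
document = "summarize: " + "Text of the document to be summarized goes here."

# Generated summaries in this card average ~5 tokens, so a small max_length suffices.
result = summarizer(document, max_length=30, min_length=2, do_sample=False)
print(result[0]["summary_text"])
```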

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch reproducing them follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
  • mixed_precision_training: Native AMP
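
As a rough guide, these settings correspond to the `Seq2SeqTrainingArguments` below. This is a reconstruction sketch rather than the exact configuration used: the output directory and the per-epoch evaluation strategy are assumptions, and the Adam betas/epsilon listed above are the optimizer defaults, so they are not repeated.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of the hyperparameters listed above; output_dir and
# evaluation_strategy are assumptions, not recorded values.
training_args = Seq2SeqTrainingArguments(
    output_dir="my_awesome_billsum_model_26",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=100,
    fp16=True,                    # "Native AMP" mixed-precision training
    evaluation_strategy="epoch",  # assumed: metrics below are reported once per epoch
    predict_with_generate=True,   # needed for ROUGE / generation-length metrics
)
```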

Training results

| Training Loss | Epoch | Step | Validation Loss | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:----------:|:-------:|
| No log | 1.0 | 12 | 2.0408 | 0.4016 | 0.2781 | 0.3809 | 0.3805 | 17.4792 |
| No log | 2.0 | 24 | 1.4527 | 0.4407 | 0.3104 | 0.4119 | 0.412 | 16.3125 |
| No log | 3.0 | 36 | 0.8914 | 0.6139 | 0.5031 | 0.5902 | 0.5874 | 12.2292 |
| No log | 4.0 | 48 | 0.5897 | 0.9653 | 0.8808 | 0.9235 | 0.9251 | 5.0208 |
| No log | 5.0 | 60 | 0.5210 | 0.9702 | 0.8931 | 0.9291 | 0.9311 | 5.0417 |
| No log | 6.0 | 72 | 0.4877 | 0.968 | 0.8841 | 0.9215 | 0.9241 | 5.0625 |
| No log | 7.0 | 84 | 0.4571 | 0.9724 | 0.8944 | 0.9327 | 0.9343 | 5.1458 |
| No log | 8.0 | 96 | 0.4342 | 0.9724 | 0.8944 | 0.9327 | 0.9343 | 5.1458 |
| No log | 9.0 | 108 | 0.4129 | 0.9724 | 0.8944 | 0.9327 | 0.9343 | 5.1458 |
| No log | 10.0 | 120 | 0.3946 | 0.9701 | 0.8859 | 0.9215 | 0.9219 | 5.1667 |
| No log | 11.0 | 132 | 0.3824 | 0.9707 | 0.8967 | 0.9308 | 0.9323 | 5.0833 |
| No log | 12.0 | 144 | 0.3732 | 0.9678 | 0.8723 | 0.9142 | 0.9157 | 5.1042 |
| No log | 13.0 | 156 | 0.3597 | 0.9678 | 0.8723 | 0.9142 | 0.9157 | 5.1042 |
| No log | 14.0 | 168 | 0.3501 | 0.9678 | 0.8723 | 0.9142 | 0.9157 | 5.1042 |
| No log | 15.0 | 180 | 0.3391 | 0.9713 | 0.8845 | 0.9236 | 0.9236 | 5.125 |
| No log | 16.0 | 192 | 0.3338 | 0.9713 | 0.8845 | 0.9236 | 0.9236 | 5.125 |
| No log | 17.0 | 204 | 0.3271 | 0.9713 | 0.8845 | 0.9236 | 0.9236 | 5.125 |
| No log | 18.0 | 216 | 0.3251 | 0.9713 | 0.8845 | 0.9236 | 0.9236 | 5.125 |
| No log | 19.0 | 228 | 0.3243 | 0.9713 | 0.8845 | 0.9236 | 0.9236 | 5.125 |
| No log | 20.0 | 240 | 0.3229 | 0.9713 | 0.8773 | 0.9236 | 0.9236 | 5.125 |
| No log | 21.0 | 252 | 0.3229 | 0.9713 | 0.8773 | 0.9236 | 0.9236 | 5.125 |
| No log | 22.0 | 264 | 0.3182 | 0.9713 | 0.8773 | 0.9236 | 0.9236 | 5.125 |
| No log | 23.0 | 276 | 0.3128 | 0.9713 | 0.8773 | 0.9236 | 0.9236 | 5.125 |
| No log | 24.0 | 288 | 0.3104 | 0.969 | 0.8773 | 0.9224 | 0.9225 | 5.1458 |
| No log | 25.0 | 300 | 0.3100 | 0.969 | 0.8773 | 0.9224 | 0.9225 | 5.1458 |
| No log | 26.0 | 312 | 0.3078 | 0.969 | 0.8773 | 0.9224 | 0.9225 | 5.1458 |
| No log | 27.0 | 324 | 0.3076 | 0.969 | 0.8773 | 0.9224 | 0.9225 | 5.1458 |
| No log | 28.0 | 336 | 0.3063 | 0.966 | 0.875 | 0.9204 | 0.9211 | 5.1667 |
| No log | 29.0 | 348 | 0.3014 | 0.9692 | 0.8891 | 0.9291 | 0.9311 | 5.1875 |
| No log | 30.0 | 360 | 0.3018 | 0.9692 | 0.8891 | 0.9291 | 0.9311 | 5.1875 |
| No log | 31.0 | 372 | 0.3007 | 0.9692 | 0.8891 | 0.9291 | 0.9311 | 5.1875 |
| No log | 32.0 | 384 | 0.2968 | 0.9692 | 0.8891 | 0.9291 | 0.9311 | 5.1875 |
| No log | 33.0 | 396 | 0.2931 | 0.9692 | 0.8891 | 0.9291 | 0.9311 | 5.1875 |
| No log | 34.0 | 408 | 0.2909 | 0.9692 | 0.8891 | 0.9291 | 0.9311 | 5.1875 |
| No log | 35.0 | 420 | 0.2893 | 0.9692 | 0.8891 | 0.9291 | 0.9311 | 5.1875 |
| No log | 36.0 | 432 | 0.2881 | 0.9692 | 0.8891 | 0.9291 | 0.9311 | 5.1875 |
| No log | 37.0 | 444 | 0.2881 | 0.9692 | 0.8891 | 0.9291 | 0.9311 | 5.1875 |
| No log | 38.0 | 456 | 0.2877 | 0.9692 | 0.8891 | 0.9291 | 0.9311 | 5.1875 |
| No log | 39.0 | 468 | 0.2905 | 0.9692 | 0.8891 | 0.9291 | 0.9311 | 5.1875 |
| No log | 40.0 | 480 | 0.2900 | 0.9692 | 0.8891 | 0.9291 | 0.9311 | 5.1875 |
| No log | 41.0 | 492 | 0.2901 | 0.9692 | 0.8891 | 0.9291 | 0.9311 | 5.1875 |
| 0.4635 | 42.0 | 504 | 0.2904 | 0.9754 | 0.8931 | 0.9315 | 0.9328 | 5.2292 |
| 0.4635 | 43.0 | 516 | 0.2885 | 0.9692 | 0.8891 | 0.9291 | 0.9311 | 5.1875 |
| 0.4635 | 44.0 | 528 | 0.2895 | 0.9692 | 0.8891 | 0.9291 | 0.9311 | 5.1875 |
| 0.4635 | 45.0 | 540 | 0.2898 | 0.9724 | 0.9091 | 0.9437 | 0.9452 | 5.2083 |
| 0.4635 | 46.0 | 552 | 0.2869 | 0.9724 | 0.9091 | 0.9437 | 0.9452 | 5.2083 |
| 0.4635 | 47.0 | 564 | 0.2880 | 0.9724 | 0.9091 | 0.9437 | 0.9452 | 5.2083 |
| 0.4635 | 48.0 | 576 | 0.2893 | 0.9724 | 0.9091 | 0.9385 | 0.9402 | 5.2083 |
| 0.4635 | 49.0 | 588 | 0.2916 | 0.9724 | 0.9091 | 0.9437 | 0.9452 | 5.2083 |
| 0.4635 | 50.0 | 600 | 0.2903 | 0.9724 | 0.9091 | 0.9385 | 0.9402 | 5.2083 |
| 0.4635 | 51.0 | 612 | 0.2870 | 0.9724 | 0.9091 | 0.9385 | 0.9402 | 5.2083 |
| 0.4635 | 52.0 | 624 | 0.2856 | 0.9724 | 0.8946 | 0.9335 | 0.935 | 5.2083 |
| 0.4635 | 53.0 | 636 | 0.2835 | 0.9715 | 0.8972 | 0.9314 | 0.9327 | 5.1667 |
| 0.4635 | 54.0 | 648 | 0.2844 | 0.9724 | 0.9091 | 0.9385 | 0.9402 | 5.2083 |
| 0.4635 | 55.0 | 660 | 0.2873 | 0.9724 | 0.9091 | 0.9385 | 0.9402 | 5.2083 |
| 0.4635 | 56.0 | 672 | 0.2915 | 0.9756 | 0.9306 | 0.9477 | 0.9494 | 5.2292 |
| 0.4635 | 57.0 | 684 | 0.2938 | 0.9756 | 0.9306 | 0.9477 | 0.9494 | 5.2292 |
| 0.4635 | 58.0 | 696 | 0.2934 | 0.9756 | 0.9306 | 0.9477 | 0.9494 | 5.2292 |
| 0.4635 | 59.0 | 708 | 0.2890 | 0.9756 | 0.9306 | 0.9477 | 0.9494 | 5.2292 |
| 0.4635 | 60.0 | 720 | 0.2858 | 0.9756 | 0.9306 | 0.9477 | 0.9494 | 5.2292 |
| 0.4635 | 61.0 | 732 | 0.2881 | 0.9756 | 0.9306 | 0.9477 | 0.9494 | 5.2292 |
| 0.4635 | 62.0 | 744 | 0.2889 | 0.9756 | 0.9306 | 0.9477 | 0.9494 | 5.2292 |
| 0.4635 | 63.0 | 756 | 0.2878 | 0.9724 | 0.9091 | 0.9385 | 0.9402 | 5.2083 |
| 0.4635 | 64.0 | 768 | 0.2904 | 0.979 | 0.9134 | 0.9402 | 0.942 | 5.25 |
| 0.4635 | 65.0 | 780 | 0.2917 | 0.979 | 0.9134 | 0.9402 | 0.942 | 5.25 |
| 0.4635 | 66.0 | 792 | 0.2919 | 0.979 | 0.9134 | 0.9402 | 0.942 | 5.25 |
| 0.4635 | 67.0 | 804 | 0.2893 | 0.9821 | 0.9347 | 0.9494 | 0.9511 | 5.2708 |
| 0.4635 | 68.0 | 816 | 0.2894 | 0.9821 | 0.9347 | 0.9494 | 0.9511 | 5.2708 |
| 0.4635 | 69.0 | 828 | 0.2876 | 0.9821 | 0.9347 | 0.9494 | 0.9511 | 5.2708 |
| 0.4635 | 70.0 | 840 | 0.2913 | 0.9821 | 0.9347 | 0.9494 | 0.9511 | 5.2708 |
| 0.4635 | 71.0 | 852 | 0.2912 | 0.9821 | 0.9347 | 0.9494 | 0.9511 | 5.2708 |
| 0.4635 | 72.0 | 864 | 0.2935 | 0.9821 | 0.9347 | 0.9494 | 0.9511 | 5.2708 |
| 0.4635 | 73.0 | 876 | 0.2962 | 0.9821 | 0.9347 | 0.9494 | 0.9511 | 5.2708 |
| 0.4635 | 74.0 | 888 | 0.2987 | 0.9821 | 0.9347 | 0.9494 | 0.9511 | 5.2708 |
| 0.4635 | 75.0 | 900 | 0.2987 | 0.9821 | 0.9347 | 0.9494 | 0.9511 | 5.2708 |
| 0.4635 | 76.0 | 912 | 0.2972 | 0.9821 | 0.9347 | 0.9494 | 0.9511 | 5.2708 |
| 0.4635 | 77.0 | 924 | 0.2979 | 0.9821 | 0.9347 | 0.9494 | 0.9511 | 5.2708 |
| 0.4635 | 78.0 | 936 | 0.2992 | 0.9821 | 0.9347 | 0.9494 | 0.9511 | 5.2708 |
| 0.4635 | 79.0 | 948 | 0.3006 | 0.9821 | 0.9347 | 0.9494 | 0.9511 | 5.2708 |
| 0.4635 | 80.0 | 960 | 0.3000 | 0.9821 | 0.9347 | 0.9494 | 0.9511 | 5.2708 |
| 0.4635 | 81.0 | 972 | 0.2975 | 0.9821 | 0.9347 | 0.9494 | 0.9511 | 5.2708 |
| 0.4635 | 82.0 | 984 | 0.2958 | 0.9821 | 0.9347 | 0.9494 | 0.9511 | 5.2708 |
| 0.4635 | 83.0 | 996 | 0.2954 | 0.9821 | 0.9347 | 0.9494 | 0.9511 | 5.2708 |
| 0.0955 | 84.0 | 1008 | 0.2949 | 0.9821 | 0.9347 | 0.9494 | 0.9511 | 5.2708 |
| 0.0955 | 85.0 | 1020 | 0.2933 | 0.9821 | 0.9347 | 0.9494 | 0.9511 | 5.2708 |
| 0.0955 | 86.0 | 1032 | 0.2931 | 0.9821 | 0.9347 | 0.9494 | 0.9511 | 5.2708 |
| 0.0955 | 87.0 | 1044 | 0.2927 | 0.9821 | 0.9347 | 0.9494 | 0.9511 | 5.2708 |
| 0.0955 | 88.0 | 1056 | 0.2910 | 0.9821 | 0.9347 | 0.9494 | 0.9511 | 5.2708 |
| 0.0955 | 89.0 | 1068 | 0.2909 | 0.9821 | 0.9347 | 0.9494 | 0.9511 | 5.2708 |
| 0.0955 | 90.0 | 1080 | 0.2910 | 0.9821 | 0.9347 | 0.9494 | 0.9511 | 5.2708 |
| 0.0955 | 91.0 | 1092 | 0.2923 | 0.9821 | 0.9347 | 0.9494 | 0.9511 | 5.2708 |
| 0.0955 | 92.0 | 1104 | 0.2926 | 0.9821 | 0.9347 | 0.9494 | 0.9511 | 5.2708 |
| 0.0955 | 93.0 | 1116 | 0.2928 | 0.9821 | 0.9347 | 0.9494 | 0.9511 | 5.2708 |
| 0.0955 | 94.0 | 1128 | 0.2929 | 0.9821 | 0.9347 | 0.9494 | 0.9511 | 5.2708 |
| 0.0955 | 95.0 | 1140 | 0.2929 | 0.9821 | 0.9347 | 0.9494 | 0.9511 | 5.2708 |
| 0.0955 | 96.0 | 1152 | 0.2931 | 0.9821 | 0.9347 | 0.9494 | 0.9511 | 5.2708 |
| 0.0955 | 97.0 | 1164 | 0.2939 | 0.9821 | 0.9347 | 0.9494 | 0.9511 | 5.2708 |
| 0.0955 | 98.0 | 1176 | 0.2942 | 0.9821 | 0.9347 | 0.9494 | 0.9511 | 5.2708 |
| 0.0955 | 99.0 | 1188 | 0.2944 | 0.9821 | 0.9347 | 0.9494 | 0.9511 | 5.2708 |
| 0.0955 | 100.0 | 1200 | 0.2944 | 0.9821 | 0.9347 | 0.9494 | 0.9511 | 5.2708 |
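
The "No log" entries simply mean that no running training loss had been logged yet at that evaluation step (the Trainer logs it at fixed step intervals, which is likely why the first value only appears around step 504). The ROUGE values above can be reproduced with the `evaluate` library; the snippet below is a sketch with placeholder predictions and references rather than the actual evaluation outputs.

```python
import evaluate

rouge = evaluate.load("rouge")

# Placeholder strings; in practice these come from model.generate(...)
# on the evaluation set and from its reference summaries.
predictions = ["the bill amends the tax code"]
references = ["the bill amends the internal revenue code"]

scores = rouge.compute(predictions=predictions, references=references, use_stemmer=True)
print(scores)  # keys: rouge1, rouge2, rougeL, rougeLsum
```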

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.3.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1
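
To check that a local environment matches these versions, a quick sketch:

```python
import transformers, torch, datasets, tokenizers

# Expected versions from this card: 4.41.2, 2.3.0+cu121, 2.20.0, 0.19.1
print(transformers.__version__, torch.__version__, datasets.__version__, tokenizers.__version__)
```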