
my_awesome_billsum_model_80

This model is a fine-tuned version of google-t5/t5-small on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1923
  • Rouge1: 0.9697
  • Rouge2: 0.8445
  • Rougel: 0.9199
  • Rougelsum: 0.9179
  • Gen Len: 4.9583
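
Since the card omits a usage snippet, the sketch below shows one way to run the checkpoint for summarization with the transformers pipeline. The model path is an assumption (the output directory name from this card) and should be replaced with the actual Hub repo id or local path.

```python
from transformers import pipeline

# Model path is an assumption: substitute the published Hub repo id
# or the local output directory holding this checkpoint.
summarizer = pipeline("summarization", model="my_awesome_billsum_model_80")

document = (
    "The bill establishes a grant program for state transportation "
    "departments to repair rural bridges and sets annual reporting requirements."
)

# The evaluation Gen Len (~5 tokens) suggests very short target summaries,
# so keep the generation budget small.
print(summarizer(document, max_new_tokens=16)[0]["summary_text"])
```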

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
  • mixed_precision_training: Native AMP
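
For reference, here is a minimal sketch of how these settings map onto `Seq2SeqTrainingArguments` in the standard `Trainer` workflow this auto-generated card implies; `output_dir` and the evaluation settings are assumptions, and dataset/tokenizer wiring is omitted.

```python
from transformers import Seq2SeqTrainingArguments

# Mirrors the hyperparameters listed above. The Adam betas and epsilon
# match the library defaults, so they need no explicit flag.
training_args = Seq2SeqTrainingArguments(
    output_dir="my_awesome_billsum_model_80",  # assumed output directory
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    num_train_epochs=100,
    lr_scheduler_type="linear",
    fp16=True,                       # "Native AMP" mixed precision
    evaluation_strategy="epoch",     # the results table reports one row per epoch
    predict_with_generate=True,      # so ROUGE is computed on generated text
)
```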

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|
| No log        | 1.0   | 12   | 2.0545          | 0.4101 | 0.2839 | 0.3907 | 0.3895    | 16.8125 |
| No log        | 2.0   | 24   | 1.4437          | 0.442  | 0.3195 | 0.4261 | 0.4245    | 15.9583 |
| No log        | 3.0   | 36   | 0.8267          | 0.5727 | 0.4315 | 0.541  | 0.5416    | 12.8125 |
| No log        | 4.0   | 48   | 0.5186          | 0.9583 | 0.8429 | 0.9113 | 0.91      | 5.25    |
| No log        | 5.0   | 60   | 0.4535          | 0.9739 | 0.8607 | 0.9276 | 0.9271    | 4.875   |
| No log        | 6.0   | 72   | 0.4258          | 0.9769 | 0.8768 | 0.9365 | 0.9365    | 4.8958  |
| No log        | 7.0   | 84   | 0.4014          | 0.9798 | 0.8869 | 0.9454 | 0.9464    | 4.9167  |
| No log        | 8.0   | 96   | 0.3779          | 0.9798 | 0.8869 | 0.9454 | 0.9464    | 4.9167  |
| No log        | 9.0   | 108  | 0.3663          | 0.9769 | 0.8726 | 0.9365 | 0.9375    | 4.9375  |
| No log        | 10.0  | 120  | 0.3554          | 0.9687 | 0.8444 | 0.922  | 0.9226    | 5.0     |
| No log        | 11.0  | 132  | 0.3461          | 0.9687 | 0.8444 | 0.922  | 0.9226    | 5.0     |
| No log        | 12.0  | 144  | 0.3339          | 0.9716 | 0.8569 | 0.9314 | 0.9314    | 4.9792  |
| No log        | 13.0  | 156  | 0.3242          | 0.9716 | 0.8569 | 0.9314 | 0.9314    | 4.9792  |
| No log        | 14.0  | 168  | 0.3155          | 0.9716 | 0.8569 | 0.9314 | 0.9314    | 4.9792  |
| No log        | 15.0  | 180  | 0.3030          | 0.9716 | 0.8569 | 0.9314 | 0.9314    | 4.9792  |
| No log        | 16.0  | 192  | 0.2979          | 0.9676 | 0.8361 | 0.9193 | 0.9173    | 5.0     |
| No log        | 17.0  | 204  | 0.2957          | 0.9676 | 0.8361 | 0.9193 | 0.9173    | 5.0     |
| No log        | 18.0  | 216  | 0.2950          | 0.9676 | 0.8361 | 0.9193 | 0.9173    | 5.0     |
| No log        | 19.0  | 228  | 0.2840          | 0.9676 | 0.8361 | 0.9193 | 0.9173    | 5.0     |
| No log        | 20.0  | 240  | 0.2778          | 0.9676 | 0.8361 | 0.9193 | 0.9173    | 5.0     |
| No log        | 21.0  | 252  | 0.2662          | 0.9676 | 0.8361 | 0.9193 | 0.9173    | 5.0     |
| No log        | 22.0  | 264  | 0.2609          | 0.9676 | 0.8361 | 0.9193 | 0.9173    | 5.0     |
| No log        | 23.0  | 276  | 0.2587          | 0.9676 | 0.8361 | 0.9193 | 0.9173    | 5.0     |
| No log        | 24.0  | 288  | 0.2567          | 0.9676 | 0.8361 | 0.9193 | 0.9173    | 5.0     |
| No log        | 25.0  | 300  | 0.2604          | 0.9676 | 0.8361 | 0.9193 | 0.9173    | 5.0     |
| No log        | 26.0  | 312  | 0.2540          | 0.9676 | 0.8361 | 0.9193 | 0.9173    | 5.0     |
| No log        | 27.0  | 324  | 0.2514          | 0.9676 | 0.8361 | 0.9193 | 0.9173    | 5.0     |
| No log        | 28.0  | 336  | 0.2437          | 0.9676 | 0.8361 | 0.9193 | 0.9173    | 5.0     |
| No log        | 29.0  | 348  | 0.2370          | 0.9676 | 0.8361 | 0.9193 | 0.9173    | 5.0     |
| No log        | 30.0  | 360  | 0.2369          | 0.9676 | 0.8361 | 0.9193 | 0.9173    | 5.0     |
| No log        | 31.0  | 372  | 0.2347          | 0.9676 | 0.8361 | 0.9193 | 0.9173    | 5.0     |
| No log        | 32.0  | 384  | 0.2329          | 0.9676 | 0.8361 | 0.9193 | 0.9173    | 5.0     |
| No log        | 33.0  | 396  | 0.2327          | 0.9676 | 0.8361 | 0.9193 | 0.9173    | 5.0     |
| No log        | 34.0  | 408  | 0.2271          | 0.9676 | 0.8361 | 0.9193 | 0.9173    | 5.0     |
| No log        | 35.0  | 420  | 0.2231          | 0.9676 | 0.8361 | 0.9193 | 0.9173    | 5.0     |
| No log        | 36.0  | 432  | 0.2177          | 0.9676 | 0.8361 | 0.9193 | 0.9173    | 5.0     |
| No log        | 37.0  | 444  | 0.2168          | 0.9676 | 0.8361 | 0.9193 | 0.9173    | 5.0     |
| No log        | 38.0  | 456  | 0.2154          | 0.971  | 0.8468 | 0.9222 | 0.9202    | 4.9583  |
| No log        | 39.0  | 468  | 0.2187          | 0.9676 | 0.8361 | 0.9193 | 0.9173    | 5.0     |
| No log        | 40.0  | 480  | 0.2202          | 0.971  | 0.8468 | 0.9222 | 0.9202    | 4.9583  |
| No log        | 41.0  | 492  | 0.2164          | 0.971  | 0.8468 | 0.9222 | 0.9202    | 4.9583  |
| 0.4771        | 42.0  | 504  | 0.2160          | 0.971  | 0.8468 | 0.9222 | 0.9202    | 4.9583  |
| 0.4771        | 43.0  | 516  | 0.2179          | 0.971  | 0.8468 | 0.9222 | 0.9202    | 4.9583  |
| 0.4771        | 44.0  | 528  | 0.2182          | 0.971  | 0.8468 | 0.9222 | 0.9202    | 4.9583  |
| 0.4771        | 45.0  | 540  | 0.2206          | 0.9675 | 0.8359 | 0.9136 | 0.9111    | 4.9792  |
| 0.4771        | 46.0  | 552  | 0.2172          | 0.9675 | 0.8359 | 0.9136 | 0.9111    | 4.9792  |
| 0.4771        | 47.0  | 564  | 0.2128          | 0.971  | 0.8468 | 0.9222 | 0.9202    | 4.9583  |
| 0.4771        | 48.0  | 576  | 0.2194          | 0.971  | 0.8468 | 0.9222 | 0.9202    | 4.9583  |
| 0.4771        | 49.0  | 588  | 0.2204          | 0.971  | 0.8468 | 0.9222 | 0.9202    | 4.9583  |
| 0.4771        | 50.0  | 600  | 0.2124          | 0.971  | 0.8468 | 0.9222 | 0.9202    | 4.9583  |
| 0.4771        | 51.0  | 612  | 0.2136          | 0.9675 | 0.8359 | 0.9136 | 0.9111    | 4.9792  |
| 0.4771        | 52.0  | 624  | 0.2119          | 0.9675 | 0.8359 | 0.9136 | 0.9111    | 4.9792  |
| 0.4771        | 53.0  | 636  | 0.2085          | 0.9675 | 0.8359 | 0.9136 | 0.9111    | 4.9792  |
| 0.4771        | 54.0  | 648  | 0.2115          | 0.9675 | 0.8359 | 0.9136 | 0.9111    | 4.9792  |
| 0.4771        | 55.0  | 660  | 0.2133          | 0.9675 | 0.8359 | 0.9136 | 0.9111    | 4.9792  |
| 0.4771        | 56.0  | 672  | 0.2087          | 0.9675 | 0.8359 | 0.9136 | 0.9111    | 4.9792  |
| 0.4771        | 57.0  | 684  | 0.2057          | 0.9675 | 0.8359 | 0.9136 | 0.9111    | 4.9792  |
| 0.4771        | 58.0  | 696  | 0.2095          | 0.9697 | 0.8445 | 0.9199 | 0.9179    | 4.9583  |
| 0.4771        | 59.0  | 708  | 0.2105          | 0.9675 | 0.8359 | 0.9136 | 0.9111    | 4.9792  |
| 0.4771        | 60.0  | 720  | 0.2123          | 0.9675 | 0.8359 | 0.9136 | 0.9111    | 4.9792  |
| 0.4771        | 61.0  | 732  | 0.2120          | 0.9675 | 0.8359 | 0.9136 | 0.9111    | 4.9792  |
| 0.4771        | 62.0  | 744  | 0.2132          | 0.9675 | 0.8359 | 0.9136 | 0.9111    | 4.9792  |
| 0.4771        | 63.0  | 756  | 0.2117          | 0.9675 | 0.8359 | 0.9136 | 0.9111    | 4.9792  |
| 0.4771        | 64.0  | 768  | 0.2068          | 0.9675 | 0.8359 | 0.9136 | 0.9111    | 4.9792  |
| 0.4771        | 65.0  | 780  | 0.2049          | 0.9675 | 0.8359 | 0.9136 | 0.9111    | 4.9792  |
| 0.4771        | 66.0  | 792  | 0.2054          | 0.9675 | 0.8359 | 0.9136 | 0.9111    | 4.9792  |
| 0.4771        | 67.0  | 804  | 0.2029          | 0.9675 | 0.8359 | 0.9136 | 0.9111    | 4.9792  |
| 0.4771        | 68.0  | 816  | 0.1995          | 0.9675 | 0.8359 | 0.9136 | 0.9111    | 4.9792  |
| 0.4771        | 69.0  | 828  | 0.1946          | 0.9675 | 0.8359 | 0.9136 | 0.9111    | 4.9792  |
| 0.4771        | 70.0  | 840  | 0.1975          | 0.9675 | 0.8359 | 0.9136 | 0.9111    | 4.9792  |
| 0.4771        | 71.0  | 852  | 0.1995          | 0.9675 | 0.8359 | 0.9136 | 0.9111    | 4.9792  |
| 0.4771        | 72.0  | 864  | 0.2009          | 0.9675 | 0.8359 | 0.9136 | 0.9111    | 4.9792  |
| 0.4771        | 73.0  | 876  | 0.2050          | 0.9675 | 0.8359 | 0.9136 | 0.9111    | 4.9792  |
| 0.4771        | 74.0  | 888  | 0.2039          | 0.9675 | 0.8359 | 0.9136 | 0.9111    | 4.9792  |
| 0.4771        | 75.0  | 900  | 0.2040          | 0.9675 | 0.8359 | 0.9136 | 0.9111    | 4.9792  |
| 0.4771        | 76.0  | 912  | 0.2020          | 0.9675 | 0.8359 | 0.9136 | 0.9111    | 4.9792  |
| 0.4771        | 77.0  | 924  | 0.2003          | 0.9675 | 0.8359 | 0.9136 | 0.9111    | 4.9792  |
| 0.4771        | 78.0  | 936  | 0.1992          | 0.9675 | 0.8359 | 0.9136 | 0.9111    | 4.9792  |
| 0.4771        | 79.0  | 948  | 0.1984          | 0.9675 | 0.8359 | 0.9136 | 0.9111    | 4.9792  |
| 0.4771        | 80.0  | 960  | 0.1971          | 0.9675 | 0.8359 | 0.9136 | 0.9111    | 4.9792  |
| 0.4771        | 81.0  | 972  | 0.1995          | 0.9675 | 0.8359 | 0.9136 | 0.9111    | 4.9792  |
| 0.4771        | 82.0  | 984  | 0.2007          | 0.9697 | 0.8445 | 0.9199 | 0.9179    | 4.9583  |
| 0.4771        | 83.0  | 996  | 0.2020          | 0.9697 | 0.8445 | 0.9199 | 0.9179    | 4.9583  |
| 0.113         | 84.0  | 1008 | 0.2007          | 0.9697 | 0.8445 | 0.9199 | 0.9179    | 4.9583  |
| 0.113         | 85.0  | 1020 | 0.1967          | 0.9697 | 0.8445 | 0.9199 | 0.9179    | 4.9583  |
| 0.113         | 86.0  | 1032 | 0.1975          | 0.9697 | 0.8445 | 0.9199 | 0.9179    | 4.9583  |
| 0.113         | 87.0  | 1044 | 0.1967          | 0.9697 | 0.8445 | 0.9199 | 0.9179    | 4.9583  |
| 0.113         | 88.0  | 1056 | 0.1947          | 0.9697 | 0.8445 | 0.9199 | 0.9179    | 4.9583  |
| 0.113         | 89.0  | 1068 | 0.1925          | 0.9697 | 0.8445 | 0.9199 | 0.9179    | 4.9583  |
| 0.113         | 90.0  | 1080 | 0.1926          | 0.9697 | 0.8445 | 0.9199 | 0.9179    | 4.9583  |
| 0.113         | 91.0  | 1092 | 0.1937          | 0.9697 | 0.8445 | 0.9199 | 0.9179    | 4.9583  |
| 0.113         | 92.0  | 1104 | 0.1934          | 0.9697 | 0.8445 | 0.9199 | 0.9179    | 4.9583  |
| 0.113         | 93.0  | 1116 | 0.1929          | 0.9697 | 0.8445 | 0.9199 | 0.9179    | 4.9583  |
| 0.113         | 94.0  | 1128 | 0.1929          | 0.9697 | 0.8445 | 0.9199 | 0.9179    | 4.9583  |
| 0.113         | 95.0  | 1140 | 0.1928          | 0.9697 | 0.8445 | 0.9199 | 0.9179    | 4.9583  |
| 0.113         | 96.0  | 1152 | 0.1927          | 0.9697 | 0.8445 | 0.9199 | 0.9179    | 4.9583  |
| 0.113         | 97.0  | 1164 | 0.1927          | 0.9697 | 0.8445 | 0.9199 | 0.9179    | 4.9583  |
| 0.113         | 98.0  | 1176 | 0.1925          | 0.9697 | 0.8445 | 0.9199 | 0.9179    | 4.9583  |
| 0.113         | 99.0  | 1188 | 0.1925          | 0.9697 | 0.8445 | 0.9199 | 0.9179    | 4.9583  |
| 0.113         | 100.0 | 1200 | 0.1923          | 0.9697 | 0.8445 | 0.9199 | 0.9179    | 4.9583  |
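
The ROUGE columns above follow the key naming of the `evaluate` library's rouge metric, which is the usual choice in this workflow (an assumption; the card does not say how the scores were computed). A minimal sketch with hypothetical predictions and references:

```python
import evaluate

rouge = evaluate.load("rouge")

# Hypothetical decoded model outputs and gold summaries, for illustration only.
predictions = ["grants for rural bridge repair"]
references = ["grant program for rural bridge repair"]

# Returns rouge1, rouge2, rougeL, and rougeLsum: the columns reported above.
print(rouge.compute(predictions=predictions, references=references))
```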

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.3.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1