my_awesome_billsum_model_40

This model is a fine-tuned version of google-t5/t5-small on an unspecified dataset. It achieves the following results on the evaluation set (a sketch of how such metrics are computed follows the list):

  • Loss: 0.1082
  • Rouge1: 0.9787
  • Rouge2: 0.8875
  • Rougel: 0.9329
  • Rougelsum: 0.9315
  • Gen Len: 5.2708
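
The ROUGE values are reported as 0–1 fractions rather than percentages. As a minimal sketch, scores of this kind are typically computed with the `evaluate` library; the inputs below are toy strings, not the actual evaluation data:

```python
import evaluate

rouge = evaluate.load("rouge")

# Toy prediction/reference pair; the real evaluation uses the model's generated
# summaries against the held-out reference summaries.
scores = rouge.compute(
    predictions=["the bill funds rural broadband grants"],
    references=["the bill establishes grants for rural broadband"],
)
print(scores)  # keys: rouge1, rouge2, rougeL, rougeLsum
```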

Model description

More information needed

Intended uses & limitations

More information needed
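
No usage details were provided. As a minimal sketch, assuming the checkpoint is published under a Hub id such as `my_awesome_billsum_model_40` (hypothetical here) and that the usual T5 `summarize:` prefix applies, it could be loaded for summarization like this:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Hypothetical Hub id; substitute the actual repository path of this checkpoint.
checkpoint = "my_awesome_billsum_model_40"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

# T5 summarization checkpoints are conventionally prompted with a task prefix.
text = "summarize: " + "The bill directs the agency to report annually on program outcomes."
inputs = tokenizer(text, return_tensors="pt", truncation=True)
summary_ids = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```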

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a Trainer-style sketch follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
  • mixed_precision_training: Native AMP
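
Expressed through the Transformers Trainer API, these values would correspond roughly to the `Seq2SeqTrainingArguments` below. This is a sketch, not the original training script; the output directory, evaluation strategy, and `predict_with_generate` flag are assumptions:

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of the hyperparameters listed above; output_dir, evaluation_strategy,
# and predict_with_generate are assumptions, not taken from the original run.
training_args = Seq2SeqTrainingArguments(
    output_dir="my_awesome_billsum_model_40",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=100,
    fp16=True,                    # "Native AMP" mixed-precision training
    predict_with_generate=True,   # needed for ROUGE / generation-length metrics
    evaluation_strategy="epoch",  # assumption: the table reports metrics once per epoch
)
```

The Adam betas (0.9, 0.999) and epsilon 1e-08 match the Trainer defaults, so they are not set explicitly.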

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|
| No log | 1.0 | 12 | 1.8920 | 0.4209 | 0.2805 | 0.384 | 0.3833 | 17.2292 |
| No log | 2.0 | 24 | 1.3065 | 0.4547 | 0.3125 | 0.4113 | 0.41 | 16.0 |
| No log | 3.0 | 36 | 0.8117 | 0.6973 | 0.546 | 0.6397 | 0.6374 | 10.3125 |
| No log | 4.0 | 48 | 0.6088 | 0.9492 | 0.7941 | 0.867 | 0.8609 | 5.1458 |
| No log | 5.0 | 60 | 0.5672 | 0.9513 | 0.797 | 0.8689 | 0.8631 | 5.125 |
| No log | 6.0 | 72 | 0.5178 | 0.9537 | 0.8052 | 0.8814 | 0.878 | 5.1458 |
| No log | 7.0 | 84 | 0.4737 | 0.9669 | 0.8387 | 0.9018 | 0.8988 | 5.1458 |
| No log | 8.0 | 96 | 0.4479 | 0.9709 | 0.8452 | 0.8972 | 0.8948 | 5.1667 |
| No log | 9.0 | 108 | 0.4178 | 0.9739 | 0.8595 | 0.9048 | 0.9038 | 5.1875 |
| No log | 10.0 | 120 | 0.3904 | 0.9739 | 0.8595 | 0.9048 | 0.9038 | 5.1875 |
| No log | 11.0 | 132 | 0.3681 | 0.9739 | 0.8595 | 0.9048 | 0.9038 | 5.1875 |
| No log | 12.0 | 144 | 0.3463 | 0.9769 | 0.8601 | 0.9066 | 0.9056 | 5.2083 |
| No log | 13.0 | 156 | 0.3295 | 0.9669 | 0.8253 | 0.887 | 0.8832 | 5.2917 |
| No log | 14.0 | 168 | 0.3124 | 0.9648 | 0.8236 | 0.8917 | 0.8885 | 5.3125 |
| No log | 15.0 | 180 | 0.3007 | 0.9648 | 0.8236 | 0.8917 | 0.8885 | 5.3125 |
| No log | 16.0 | 192 | 0.2976 | 0.9692 | 0.8346 | 0.8947 | 0.8908 | 5.2708 |
| No log | 17.0 | 204 | 0.2963 | 0.9671 | 0.833 | 0.8986 | 0.8952 | 5.2917 |
| No log | 18.0 | 216 | 0.2911 | 0.9671 | 0.833 | 0.8986 | 0.8952 | 5.2917 |
| No log | 19.0 | 228 | 0.2853 | 0.9717 | 0.8469 | 0.9028 | 0.9002 | 5.2917 |
| No log | 20.0 | 240 | 0.2782 | 0.9717 | 0.8469 | 0.9028 | 0.9002 | 5.2917 |
| No log | 21.0 | 252 | 0.2802 | 0.97 | 0.8462 | 0.9066 | 0.9043 | 5.3125 |
| No log | 22.0 | 264 | 0.2746 | 0.97 | 0.8462 | 0.9066 | 0.9043 | 5.3125 |
| No log | 23.0 | 276 | 0.2615 | 0.97 | 0.8462 | 0.9066 | 0.9043 | 5.3125 |
| No log | 24.0 | 288 | 0.2504 | 0.97 | 0.8462 | 0.9066 | 0.9043 | 5.3125 |
| No log | 25.0 | 300 | 0.2398 | 0.9656 | 0.8254 | 0.8946 | 0.8916 | 5.3333 |
| No log | 26.0 | 312 | 0.2301 | 0.9656 | 0.8254 | 0.8946 | 0.8916 | 5.3333 |
| No log | 27.0 | 324 | 0.2173 | 0.9656 | 0.8254 | 0.8946 | 0.8916 | 5.3333 |
| No log | 28.0 | 336 | 0.2109 | 0.9632 | 0.8237 | 0.8931 | 0.8899 | 5.3542 |
| No log | 29.0 | 348 | 0.2028 | 0.9632 | 0.8237 | 0.8931 | 0.8899 | 5.3542 |
| No log | 30.0 | 360 | 0.2016 | 0.9632 | 0.8237 | 0.8931 | 0.8899 | 5.3542 |
| No log | 31.0 | 372 | 0.1994 | 0.9632 | 0.8237 | 0.8931 | 0.8899 | 5.3542 |
| No log | 32.0 | 384 | 0.1986 | 0.9632 | 0.8237 | 0.8931 | 0.8899 | 5.3542 |
| No log | 33.0 | 396 | 0.1987 | 0.9632 | 0.8237 | 0.8931 | 0.8899 | 5.3542 |
| No log | 34.0 | 408 | 0.1965 | 0.9632 | 0.8237 | 0.8931 | 0.8899 | 5.3542 |
| No log | 35.0 | 420 | 0.1853 | 0.9632 | 0.8237 | 0.8931 | 0.8899 | 5.3542 |
| No log | 36.0 | 432 | 0.1841 | 0.9657 | 0.8368 | 0.9013 | 0.8982 | 5.3333 |
| No log | 37.0 | 444 | 0.1792 | 0.9657 | 0.8368 | 0.9013 | 0.8982 | 5.3333 |
| No log | 38.0 | 456 | 0.1778 | 0.9681 | 0.8379 | 0.8979 | 0.8954 | 5.3125 |
| No log | 39.0 | 468 | 0.1758 | 0.9657 | 0.8368 | 0.9013 | 0.8982 | 5.3333 |
| No log | 40.0 | 480 | 0.1778 | 0.9657 | 0.8368 | 0.9013 | 0.8982 | 5.3333 |
| No log | 41.0 | 492 | 0.1689 | 0.9638 | 0.8399 | 0.9064 | 0.904 | 5.3542 |
| 0.4636 | 42.0 | 504 | 0.1665 | 0.9638 | 0.8399 | 0.9064 | 0.904 | 5.3542 |
| 0.4636 | 43.0 | 516 | 0.1629 | 0.9657 | 0.8368 | 0.9013 | 0.8982 | 5.3333 |
| 0.4636 | 44.0 | 528 | 0.1616 | 0.9657 | 0.8472 | 0.9145 | 0.9109 | 5.3333 |
| 0.4636 | 45.0 | 540 | 0.1603 | 0.9681 | 0.8492 | 0.9112 | 0.9079 | 5.3125 |
| 0.4636 | 46.0 | 552 | 0.1592 | 0.9681 | 0.8492 | 0.9112 | 0.9079 | 5.3125 |
| 0.4636 | 47.0 | 564 | 0.1547 | 0.9681 | 0.8492 | 0.9112 | 0.9079 | 5.3125 |
| 0.4636 | 48.0 | 576 | 0.1500 | 0.9681 | 0.8492 | 0.9112 | 0.9079 | 5.3125 |
| 0.4636 | 49.0 | 588 | 0.1405 | 0.9681 | 0.8379 | 0.8979 | 0.8954 | 5.3125 |
| 0.4636 | 50.0 | 600 | 0.1316 | 0.9681 | 0.8379 | 0.8979 | 0.8954 | 5.3125 |
| 0.4636 | 51.0 | 612 | 0.1338 | 0.9681 | 0.8492 | 0.9112 | 0.9079 | 5.3125 |
| 0.4636 | 52.0 | 624 | 0.1351 | 0.9681 | 0.8492 | 0.9112 | 0.9079 | 5.3125 |
| 0.4636 | 53.0 | 636 | 0.1376 | 0.9681 | 0.8492 | 0.9112 | 0.9079 | 5.3125 |
| 0.4636 | 54.0 | 648 | 0.1349 | 0.9681 | 0.8492 | 0.9112 | 0.9079 | 5.3125 |
| 0.4636 | 55.0 | 660 | 0.1349 | 0.9681 | 0.8492 | 0.9112 | 0.9079 | 5.3125 |
| 0.4636 | 56.0 | 672 | 0.1319 | 0.9681 | 0.8492 | 0.9112 | 0.9079 | 5.3125 |
| 0.4636 | 57.0 | 684 | 0.1264 | 0.9681 | 0.8492 | 0.9112 | 0.9079 | 5.3125 |
| 0.4636 | 58.0 | 696 | 0.1223 | 0.9739 | 0.875 | 0.9282 | 0.926 | 5.2708 |
| 0.4636 | 59.0 | 708 | 0.1215 | 0.9739 | 0.875 | 0.9282 | 0.926 | 5.2708 |
| 0.4636 | 60.0 | 720 | 0.1233 | 0.9739 | 0.875 | 0.9282 | 0.926 | 5.2708 |
| 0.4636 | 61.0 | 732 | 0.1225 | 0.9739 | 0.875 | 0.9282 | 0.926 | 5.2708 |
| 0.4636 | 62.0 | 744 | 0.1201 | 0.9739 | 0.875 | 0.9282 | 0.926 | 5.2708 |
| 0.4636 | 63.0 | 756 | 0.1217 | 0.9739 | 0.875 | 0.9282 | 0.926 | 5.2708 |
| 0.4636 | 64.0 | 768 | 0.1220 | 0.9739 | 0.875 | 0.9282 | 0.926 | 5.2708 |
| 0.4636 | 65.0 | 780 | 0.1227 | 0.9739 | 0.875 | 0.9282 | 0.926 | 5.2708 |
| 0.4636 | 66.0 | 792 | 0.1215 | 0.9739 | 0.875 | 0.9282 | 0.926 | 5.2708 |
| 0.4636 | 67.0 | 804 | 0.1192 | 0.9787 | 0.8875 | 0.9329 | 0.9315 | 5.2708 |
| 0.4636 | 68.0 | 816 | 0.1171 | 0.9787 | 0.8875 | 0.9329 | 0.9315 | 5.2708 |
| 0.4636 | 69.0 | 828 | 0.1146 | 0.9821 | 0.8958 | 0.9424 | 0.9408 | 5.2917 |
| 0.4636 | 70.0 | 840 | 0.1129 | 0.9821 | 0.8958 | 0.9424 | 0.9408 | 5.2917 |
| 0.4636 | 71.0 | 852 | 0.1120 | 0.9821 | 0.8958 | 0.9424 | 0.9408 | 5.2917 |
| 0.4636 | 72.0 | 864 | 0.1098 | 0.9816 | 0.9101 | 0.9459 | 0.9455 | 5.2917 |
| 0.4636 | 73.0 | 876 | 0.1091 | 0.9722 | 0.8833 | 0.9304 | 0.9289 | 5.3125 |
| 0.4636 | 74.0 | 888 | 0.1086 | 0.9757 | 0.8976 | 0.9329 | 0.9325 | 5.3333 |
| 0.4636 | 75.0 | 900 | 0.1076 | 0.9816 | 0.9101 | 0.9459 | 0.9455 | 5.2917 |
| 0.4636 | 76.0 | 912 | 0.1080 | 0.9783 | 0.8958 | 0.9433 | 0.9419 | 5.2708 |
| 0.4636 | 77.0 | 924 | 0.1095 | 0.9821 | 0.8958 | 0.9424 | 0.9408 | 5.2917 |
| 0.4636 | 78.0 | 936 | 0.1112 | 0.9821 | 0.8958 | 0.9424 | 0.9408 | 5.2917 |
| 0.4636 | 79.0 | 948 | 0.1109 | 0.9821 | 0.8958 | 0.9424 | 0.9408 | 5.2917 |
| 0.4636 | 80.0 | 960 | 0.1101 | 0.9821 | 0.8958 | 0.9424 | 0.9408 | 5.2917 |
| 0.4636 | 81.0 | 972 | 0.1111 | 0.9821 | 0.8958 | 0.9424 | 0.9408 | 5.2917 |
| 0.4636 | 82.0 | 984 | 0.1102 | 0.9821 | 0.8958 | 0.9424 | 0.9408 | 5.2917 |
| 0.4636 | 83.0 | 996 | 0.1083 | 0.9821 | 0.911 | 0.9474 | 0.9464 | 5.2917 |
| 0.1189 | 84.0 | 1008 | 0.1084 | 0.9821 | 0.911 | 0.9474 | 0.9464 | 5.2917 |
| 0.1189 | 85.0 | 1020 | 0.1085 | 0.9851 | 0.9244 | 0.9502 | 0.9498 | 5.3125 |
| 0.1189 | 86.0 | 1032 | 0.1085 | 0.9816 | 0.9244 | 0.9508 | 0.9508 | 5.2917 |
| 0.1189 | 87.0 | 1044 | 0.1087 | 0.9816 | 0.9244 | 0.9508 | 0.9508 | 5.2917 |
| 0.1189 | 88.0 | 1056 | 0.1076 | 0.9816 | 0.9244 | 0.9508 | 0.9508 | 5.2917 |
| 0.1189 | 89.0 | 1068 | 0.1085 | 0.9788 | 0.9018 | 0.9364 | 0.9359 | 5.2708 |
| 0.1189 | 90.0 | 1080 | 0.1081 | 0.9823 | 0.9018 | 0.9359 | 0.9349 | 5.2917 |
| 0.1189 | 91.0 | 1092 | 0.1075 | 0.9788 | 0.9018 | 0.9364 | 0.9359 | 5.2708 |
| 0.1189 | 92.0 | 1104 | 0.1084 | 0.9823 | 0.9018 | 0.9359 | 0.9349 | 5.2917 |
| 0.1189 | 93.0 | 1116 | 0.1086 | 0.9823 | 0.9018 | 0.9359 | 0.9349 | 5.2917 |
| 0.1189 | 94.0 | 1128 | 0.1084 | 0.9787 | 0.8875 | 0.9329 | 0.9315 | 5.2708 |
| 0.1189 | 95.0 | 1140 | 0.1088 | 0.9787 | 0.8875 | 0.9329 | 0.9315 | 5.2708 |
| 0.1189 | 96.0 | 1152 | 0.1086 | 0.9787 | 0.8875 | 0.9329 | 0.9315 | 5.2708 |
| 0.1189 | 97.0 | 1164 | 0.1085 | 0.9787 | 0.8875 | 0.9329 | 0.9315 | 5.2708 |
| 0.1189 | 98.0 | 1176 | 0.1083 | 0.9787 | 0.8875 | 0.9329 | 0.9315 | 5.2708 |
| 0.1189 | 99.0 | 1188 | 0.1082 | 0.9787 | 0.8875 | 0.9329 | 0.9315 | 5.2708 |
| 0.1189 | 100.0 | 1200 | 0.1082 | 0.9787 | 0.8875 | 0.9329 | 0.9315 | 5.2708 |

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.3.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1