
my_awesome_billsum_model_24

This model is a fine-tuned version of google-t5/t5-small on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1106
  • ROUGE-1: 0.997
  • ROUGE-2: 0.9736
  • ROUGE-L: 0.9807
  • ROUGE-Lsum: 0.9807
  • Gen Len: 5.0

Model description

More information needed

Intended uses & limitations

More information needed
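
Pending details from the authors, the sketch below shows one plausible way to query the checkpoint for summarization via the transformers pipeline API. The repo id is taken from the card title and is an assumption, as is the input text; T5-style models conventionally expect a task prefix such as "summarize: ", which is added explicitly here in case the fine-tuned config does not set one.

```python
# Minimal usage sketch; the repo id below is hypothetical, taken from the card title.
from transformers import pipeline

summarizer = pipeline("summarization", model="my_awesome_billsum_model_24")

# T5 checkpoints conventionally use a task prefix for summarization.
text = "summarize: " + "The bill amends the Internal Revenue Code to ..."  # placeholder input

# Keep generation short: the card reports an average generation length of ~5 tokens.
print(summarizer(text, max_new_tokens=16)[0]["summary_text"])
```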

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a code sketch mapping them to Seq2SeqTrainingArguments follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
  • mixed_precision_training: Native AMP
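
As a reference point, this is a minimal sketch of how the hyperparameters above map onto transformers' Seq2SeqTrainingArguments; the output_dir is a placeholder, not taken from the card.

```python
# Sketch: these values mirror the hyperparameter list above.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="my_awesome_billsum_model_24",  # hypothetical output directory
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=100,
    fp16=True,  # "Native AMP" mixed-precision training
    # Adam with betas=(0.9, 0.999) and epsilon=1e-08 is the Trainer's default
    # AdamW configuration, so no extra optimizer arguments are needed here.
)
```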

Training results

| Training Loss | Epoch | Step | Validation Loss | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:----------:|:-------:|
| No log | 1.0 | 12 | 0.1051 | 0.9929 | 0.9646 | 0.9765 | 0.978 | 4.9792 |
| No log | 2.0 | 24 | 0.1272 | 0.9869 | 0.9319 | 0.9586 | 0.96 | 4.9792 |
| No log | 3.0 | 36 | 0.1472 | 0.9892 | 0.9458 | 0.9669 | 0.9684 | 5.0417 |
| No log | 4.0 | 48 | 0.1401 | 0.9892 | 0.9458 | 0.9669 | 0.9684 | 5.0417 |
| No log | 5.0 | 60 | 0.1206 | 0.9922 | 0.9655 | 0.9758 | 0.9773 | 5.0625 |
| No log | 6.0 | 72 | 0.1185 | 0.9922 | 0.9655 | 0.9758 | 0.9773 | 5.0625 |
| No log | 7.0 | 84 | 0.1177 | 0.9922 | 0.9655 | 0.9758 | 0.9773 | 5.0625 |
| No log | 8.0 | 96 | 0.1223 | 0.9922 | 0.9655 | 0.9758 | 0.9773 | 5.0625 |
| No log | 9.0 | 108 | 0.1253 | 0.9922 | 0.9655 | 0.9758 | 0.9773 | 5.0625 |
| No log | 10.0 | 120 | 0.1257 | 0.9892 | 0.9458 | 0.9669 | 0.9684 | 5.0417 |
| No log | 11.0 | 132 | 0.1289 | 0.9899 | 0.9444 | 0.9676 | 0.969 | 4.9583 |
| No log | 12.0 | 144 | 0.1164 | 0.9899 | 0.9444 | 0.9676 | 0.969 | 4.9583 |
| No log | 13.0 | 156 | 0.1188 | 0.9911 | 0.9521 | 0.9688 | 0.969 | 5.0 |
| No log | 14.0 | 168 | 0.1235 | 0.9929 | 0.9646 | 0.9765 | 0.978 | 4.9792 |
| No log | 15.0 | 180 | 0.1323 | 0.9899 | 0.9444 | 0.9676 | 0.969 | 4.9583 |
| No log | 16.0 | 192 | 0.1341 | 0.9899 | 0.9444 | 0.9676 | 0.969 | 4.9583 |
| No log | 17.0 | 204 | 0.1331 | 0.9899 | 0.9444 | 0.9676 | 0.969 | 4.9583 |
| No log | 18.0 | 216 | 0.1169 | 0.9929 | 0.9646 | 0.9765 | 0.978 | 4.9792 |
| No log | 19.0 | 228 | 0.1169 | 0.9929 | 0.9646 | 0.9765 | 0.978 | 4.9792 |
| No log | 20.0 | 240 | 0.1162 | 0.9929 | 0.9646 | 0.9765 | 0.978 | 4.9792 |
| No log | 21.0 | 252 | 0.1200 | 0.9929 | 0.9646 | 0.9765 | 0.978 | 4.9792 |
| No log | 22.0 | 264 | 0.1176 | 0.9947 | 0.9661 | 0.9792 | 0.9797 | 4.9792 |
| No log | 23.0 | 276 | 0.1110 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| No log | 24.0 | 288 | 0.1146 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| No log | 25.0 | 300 | 0.1101 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| No log | 26.0 | 312 | 0.1064 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| No log | 27.0 | 324 | 0.1059 | 0.994 | 0.9625 | 0.9717 | 0.9717 | 5.0208 |
| No log | 28.0 | 336 | 0.1064 | 0.994 | 0.9625 | 0.9717 | 0.9717 | 5.0208 |
| No log | 29.0 | 348 | 0.1047 | 0.994 | 0.9625 | 0.9717 | 0.9717 | 5.0208 |
| No log | 30.0 | 360 | 0.1005 | 0.994 | 0.9625 | 0.9717 | 0.9717 | 5.0208 |
| No log | 31.0 | 372 | 0.0986 | 0.994 | 0.9625 | 0.9717 | 0.9717 | 5.0208 |
| No log | 32.0 | 384 | 0.0981 | 0.994 | 0.9625 | 0.9717 | 0.9717 | 5.0208 |
| No log | 33.0 | 396 | 0.0989 | 0.994 | 0.9625 | 0.9717 | 0.9717 | 5.0208 |
| No log | 34.0 | 408 | 0.1026 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| No log | 35.0 | 420 | 0.1036 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| No log | 36.0 | 432 | 0.1033 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| No log | 37.0 | 444 | 0.0995 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| No log | 38.0 | 456 | 0.0977 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| No log | 39.0 | 468 | 0.0949 | 0.994 | 0.9625 | 0.9717 | 0.9717 | 5.0208 |
| No log | 40.0 | 480 | 0.0926 | 0.9911 | 0.9521 | 0.9688 | 0.969 | 5.0 |
| No log | 41.0 | 492 | 0.0893 | 0.9911 | 0.9521 | 0.9688 | 0.969 | 5.0 |
| 0.0105 | 42.0 | 504 | 0.0871 | 0.9911 | 0.9521 | 0.9688 | 0.969 | 5.0 |
| 0.0105 | 43.0 | 516 | 0.0863 | 0.9911 | 0.9521 | 0.9688 | 0.969 | 5.0 |
| 0.0105 | 44.0 | 528 | 0.0915 | 0.9911 | 0.9521 | 0.9688 | 0.969 | 5.0 |
| 0.0105 | 45.0 | 540 | 0.0937 | 0.994 | 0.9625 | 0.9717 | 0.9717 | 5.0208 |
| 0.0105 | 46.0 | 552 | 0.0950 | 0.994 | 0.9625 | 0.9717 | 0.9717 | 5.0208 |
| 0.0105 | 47.0 | 564 | 0.0955 | 0.994 | 0.9625 | 0.9717 | 0.9717 | 5.0208 |
| 0.0105 | 48.0 | 576 | 0.0956 | 0.994 | 0.9625 | 0.9717 | 0.9717 | 5.0208 |
| 0.0105 | 49.0 | 588 | 0.0968 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| 0.0105 | 50.0 | 600 | 0.0986 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| 0.0105 | 51.0 | 612 | 0.1001 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| 0.0105 | 52.0 | 624 | 0.0995 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| 0.0105 | 53.0 | 636 | 0.0983 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| 0.0105 | 54.0 | 648 | 0.0995 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| 0.0105 | 55.0 | 660 | 0.1024 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| 0.0105 | 56.0 | 672 | 0.1040 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| 0.0105 | 57.0 | 684 | 0.1052 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| 0.0105 | 58.0 | 696 | 0.1055 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| 0.0105 | 59.0 | 708 | 0.1061 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| 0.0105 | 60.0 | 720 | 0.1053 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| 0.0105 | 61.0 | 732 | 0.1078 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| 0.0105 | 62.0 | 744 | 0.1087 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| 0.0105 | 63.0 | 756 | 0.1074 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| 0.0105 | 64.0 | 768 | 0.1039 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| 0.0105 | 65.0 | 780 | 0.1022 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| 0.0105 | 66.0 | 792 | 0.1017 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| 0.0105 | 67.0 | 804 | 0.1026 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| 0.0105 | 68.0 | 816 | 0.1050 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| 0.0105 | 69.0 | 828 | 0.1060 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| 0.0105 | 70.0 | 840 | 0.1069 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| 0.0105 | 71.0 | 852 | 0.1070 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| 0.0105 | 72.0 | 864 | 0.1048 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| 0.0105 | 73.0 | 876 | 0.1041 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| 0.0105 | 74.0 | 888 | 0.1039 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| 0.0105 | 75.0 | 900 | 0.1042 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| 0.0105 | 76.0 | 912 | 0.1056 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| 0.0105 | 77.0 | 924 | 0.1057 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| 0.0105 | 78.0 | 936 | 0.1058 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| 0.0105 | 79.0 | 948 | 0.1062 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| 0.0105 | 80.0 | 960 | 0.1072 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| 0.0105 | 81.0 | 972 | 0.1070 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| 0.0105 | 82.0 | 984 | 0.1068 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| 0.0105 | 83.0 | 996 | 0.1064 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| 0.0053 | 84.0 | 1008 | 0.1078 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| 0.0053 | 85.0 | 1020 | 0.1077 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| 0.0053 | 86.0 | 1032 | 0.1086 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| 0.0053 | 87.0 | 1044 | 0.1087 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| 0.0053 | 88.0 | 1056 | 0.1088 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| 0.0053 | 89.0 | 1068 | 0.1081 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| 0.0053 | 90.0 | 1080 | 0.1081 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| 0.0053 | 91.0 | 1092 | 0.1085 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| 0.0053 | 92.0 | 1104 | 0.1089 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| 0.0053 | 93.0 | 1116 | 0.1093 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| 0.0053 | 94.0 | 1128 | 0.1098 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| 0.0053 | 95.0 | 1140 | 0.1102 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| 0.0053 | 96.0 | 1152 | 0.1106 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| 0.0053 | 97.0 | 1164 | 0.1108 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| 0.0053 | 98.0 | 1176 | 0.1109 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| 0.0053 | 99.0 | 1188 | 0.1107 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
| 0.0053 | 100.0 | 1200 | 0.1106 | 0.997 | 0.9736 | 0.9807 | 0.9807 | 5.0 |
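
The "No log" entries in the first column likely reflect the Trainer's default of logging training loss only every 500 optimizer steps; with 12 steps per epoch, the first logged value (0.0105) appears at step 504. Note also that ROUGE scores this close to 1.0, combined with an average generation length of about 5 tokens, suggest very short and highly regular reference summaries, so these numbers should not be compared directly against full-text summarization benchmarks.

For reference, validation metrics of this kind are typically produced by a compute_metrics function built on the evaluate library. A minimal sketch, with placeholder predictions and references:

```python
# Sketch: computing ROUGE the way a Trainer compute_metrics function typically does.
import evaluate

rouge = evaluate.load("rouge")
predictions = ["short title of the bill"]  # decoded model outputs (placeholder)
references = ["short title of the bill"]   # gold summaries (placeholder)

scores = rouge.compute(predictions=predictions, references=references, use_stemmer=True)
print(scores)  # {'rouge1': ..., 'rouge2': ..., 'rougeL': ..., 'rougeLsum': ...}
```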

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.3.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1
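
To check a local environment against these pinned versions, a quick sketch:

```python
# Verify installed versions against the card's framework list.
import transformers, torch, datasets, tokenizers

print(transformers.__version__)  # card: 4.41.2
print(torch.__version__)         # card: 2.3.0+cu121
print(datasets.__version__)      # card: 2.20.0
print(tokenizers.__version__)    # card: 0.19.1
```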