
my_awesome_billsum_model_78

This model is a fine-tuned version of google-t5/t5-small; the training dataset is not specified in this card. It achieves the following results on the evaluation set (a usage sketch follows the metrics):

  • Loss: 0.5080
  • Rouge1: 0.9792
  • Rouge2: 0.8868
  • Rougel: 0.9405
  • Rougelsum: 0.94
  • Gen Len: 4.9792
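
Since the card does not yet include usage instructions, here is a minimal inference sketch using the transformers summarization pipeline. The model path and the input text are placeholders rather than values taken from this card; T5 checkpoints conventionally expect a task prefix such as "summarize: ".

```python
# Minimal inference sketch. The model path below is a placeholder for
# wherever this checkpoint is saved or published; the input text is invented.
from transformers import pipeline

summarizer = pipeline("summarization", model="my_awesome_billsum_model_78")

text = "summarize: The bill directs the Department of Transportation to ..."
result = summarizer(text, max_new_tokens=32)
print(result[0]["summary_text"])
```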

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
  • mixed_precision_training: Native AMP
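
As a reproduction aid, the list above maps onto transformers' Seq2SeqTrainingArguments roughly as in the sketch below. The output_dir is a placeholder, eval_strategy="epoch" is inferred from the per-epoch rows in the results table rather than stated in the card, and the Adam betas and epsilon listed above are the Trainer defaults.

```python
# Sketch of Seq2SeqTrainingArguments matching the hyperparameters above.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="my_awesome_billsum_model_78",  # placeholder path
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=100,
    fp16=True,                    # "Native AMP" mixed precision
    eval_strategy="epoch",        # inferred from the per-epoch results table
    predict_with_generate=True,   # needed for Rouge / Gen Len at eval time
    # Adam with betas=(0.9, 0.999) and epsilon=1e-08 corresponds to the
    # Trainer defaults (adam_beta1, adam_beta2, adam_epsilon).
)
```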

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|
| No log | 1.0 | 12 | 0.4089 | 0.9821 | 0.9104 | 0.9484 | 0.9484 | 4.9583 |
| No log | 2.0 | 24 | 0.4068 | 0.9821 | 0.9104 | 0.9484 | 0.9484 | 4.9583 |
| No log | 3.0 | 36 | 0.4284 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| No log | 4.0 | 48 | 0.4548 | 0.9792 | 0.8903 | 0.9395 | 0.9405 | 5.0208 |
| No log | 5.0 | 60 | 0.4590 | 0.9792 | 0.8903 | 0.9395 | 0.9405 | 5.0208 |
| No log | 6.0 | 72 | 0.4543 | 0.9792 | 0.8903 | 0.9395 | 0.9405 | 5.0208 |
| No log | 7.0 | 84 | 0.4863 | 0.9752 | 0.8708 | 0.9311 | 0.9311 | 5.0417 |
| No log | 8.0 | 96 | 0.4935 | 0.9732 | 0.8569 | 0.9221 | 0.9216 | 5.0208 |
| No log | 9.0 | 108 | 0.4931 | 0.9762 | 0.8691 | 0.9311 | 0.9311 | 5.0 |
| No log | 10.0 | 120 | 0.4817 | 0.9762 | 0.8691 | 0.9311 | 0.9311 | 5.0 |
| No log | 11.0 | 132 | 0.4741 | 0.9762 | 0.8691 | 0.9311 | 0.9311 | 5.0 |
| No log | 12.0 | 144 | 0.4732 | 0.9762 | 0.8691 | 0.9311 | 0.9311 | 5.0 |
| No log | 13.0 | 156 | 0.4742 | 0.9762 | 0.8691 | 0.9311 | 0.9311 | 5.0 |
| No log | 14.0 | 168 | 0.4736 | 0.9792 | 0.8903 | 0.9395 | 0.9405 | 5.0208 |
| No log | 15.0 | 180 | 0.4680 | 0.9792 | 0.8903 | 0.9395 | 0.9405 | 5.0208 |
| No log | 16.0 | 192 | 0.4534 | 0.9821 | 0.9007 | 0.9479 | 0.9494 | 5.0 |
| No log | 17.0 | 204 | 0.4412 | 0.9821 | 0.9007 | 0.9479 | 0.9494 | 5.0 |
| No log | 18.0 | 216 | 0.4341 | 0.9821 | 0.9007 | 0.9479 | 0.9494 | 5.0 |
| No log | 19.0 | 228 | 0.4317 | 0.9821 | 0.9007 | 0.9479 | 0.9494 | 5.0 |
| No log | 20.0 | 240 | 0.4315 | 0.9821 | 0.9007 | 0.9479 | 0.9494 | 5.0 |
| No log | 21.0 | 252 | 0.4313 | 0.9792 | 0.8903 | 0.9395 | 0.9405 | 5.0208 |
| No log | 22.0 | 264 | 0.4277 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| No log | 23.0 | 276 | 0.4376 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| No log | 24.0 | 288 | 0.4432 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| No log | 25.0 | 300 | 0.4450 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| No log | 26.0 | 312 | 0.4468 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| No log | 27.0 | 324 | 0.4415 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| No log | 28.0 | 336 | 0.4560 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| No log | 29.0 | 348 | 0.4713 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| No log | 30.0 | 360 | 0.4732 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| No log | 31.0 | 372 | 0.4726 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| No log | 32.0 | 384 | 0.4682 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| No log | 33.0 | 396 | 0.4647 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| No log | 34.0 | 408 | 0.4644 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| No log | 35.0 | 420 | 0.4657 | 0.9821 | 0.9007 | 0.9479 | 0.9494 | 5.0 |
| No log | 36.0 | 432 | 0.4643 | 0.9821 | 0.9007 | 0.9479 | 0.9494 | 5.0 |
| No log | 37.0 | 444 | 0.4572 | 0.9821 | 0.9007 | 0.9479 | 0.9494 | 5.0 |
| No log | 38.0 | 456 | 0.4447 | 0.9821 | 0.9007 | 0.9479 | 0.9494 | 5.0 |
| No log | 39.0 | 468 | 0.4437 | 0.9821 | 0.9007 | 0.9479 | 0.9494 | 5.0 |
| No log | 40.0 | 480 | 0.4684 | 0.9821 | 0.9007 | 0.9479 | 0.9494 | 5.0 |
| No log | 41.0 | 492 | 0.4722 | 0.9821 | 0.9007 | 0.9479 | 0.9494 | 5.0 |
| 0.0088 | 42.0 | 504 | 0.4716 | 0.9821 | 0.9007 | 0.9479 | 0.9494 | 5.0 |
| 0.0088 | 43.0 | 516 | 0.4803 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| 0.0088 | 44.0 | 528 | 0.4854 | 0.9762 | 0.8691 | 0.9311 | 0.9311 | 5.0 |
| 0.0088 | 45.0 | 540 | 0.4830 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| 0.0088 | 46.0 | 552 | 0.4819 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| 0.0088 | 47.0 | 564 | 0.4812 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| 0.0088 | 48.0 | 576 | 0.4806 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| 0.0088 | 49.0 | 588 | 0.4762 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| 0.0088 | 50.0 | 600 | 0.4737 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| 0.0088 | 51.0 | 612 | 0.4735 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| 0.0088 | 52.0 | 624 | 0.4738 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| 0.0088 | 53.0 | 636 | 0.4736 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| 0.0088 | 54.0 | 648 | 0.4738 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| 0.0088 | 55.0 | 660 | 0.4776 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| 0.0088 | 56.0 | 672 | 0.4866 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| 0.0088 | 57.0 | 684 | 0.4926 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| 0.0088 | 58.0 | 696 | 0.4938 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| 0.0088 | 59.0 | 708 | 0.4902 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| 0.0088 | 60.0 | 720 | 0.4962 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| 0.0088 | 61.0 | 732 | 0.5033 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| 0.0088 | 62.0 | 744 | 0.5043 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| 0.0088 | 63.0 | 756 | 0.5025 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| 0.0088 | 64.0 | 768 | 0.5176 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| 0.0088 | 65.0 | 780 | 0.5708 | 0.9762 | 0.8691 | 0.9311 | 0.9311 | 5.0 |
| 0.0088 | 66.0 | 792 | 0.5707 | 0.9762 | 0.8691 | 0.9311 | 0.9311 | 5.0 |
| 0.0088 | 67.0 | 804 | 0.5278 | 0.9762 | 0.8691 | 0.9311 | 0.9311 | 5.0 |
| 0.0088 | 68.0 | 816 | 0.5179 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| 0.0088 | 69.0 | 828 | 0.5164 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| 0.0088 | 70.0 | 840 | 0.5504 | 0.9762 | 0.8691 | 0.9311 | 0.9311 | 5.0 |
| 0.0088 | 71.0 | 852 | 0.5584 | 0.9762 | 0.8691 | 0.9311 | 0.9311 | 5.0 |
| 0.0088 | 72.0 | 864 | 0.5281 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| 0.0088 | 73.0 | 876 | 0.5198 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| 0.0088 | 74.0 | 888 | 0.5176 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| 0.0088 | 75.0 | 900 | 0.5103 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| 0.0088 | 76.0 | 912 | 0.5068 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| 0.0088 | 77.0 | 924 | 0.5030 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| 0.0088 | 78.0 | 936 | 0.5025 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| 0.0088 | 79.0 | 948 | 0.4968 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| 0.0088 | 80.0 | 960 | 0.5113 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| 0.0088 | 81.0 | 972 | 0.5083 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| 0.0088 | 82.0 | 984 | 0.5031 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| 0.0088 | 83.0 | 996 | 0.5066 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| 0.0059 | 84.0 | 1008 | 0.5177 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| 0.0059 | 85.0 | 1020 | 0.5192 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| 0.0059 | 86.0 | 1032 | 0.5104 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| 0.0059 | 87.0 | 1044 | 0.5085 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| 0.0059 | 88.0 | 1056 | 0.5130 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| 0.0059 | 89.0 | 1068 | 0.5116 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| 0.0059 | 90.0 | 1080 | 0.5081 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| 0.0059 | 91.0 | 1092 | 0.5074 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| 0.0059 | 92.0 | 1104 | 0.5090 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| 0.0059 | 93.0 | 1116 | 0.5097 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| 0.0059 | 94.0 | 1128 | 0.5123 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| 0.0059 | 95.0 | 1140 | 0.5118 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| 0.0059 | 96.0 | 1152 | 0.5089 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| 0.0059 | 97.0 | 1164 | 0.5080 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| 0.0059 | 98.0 | 1176 | 0.5079 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| 0.0059 | 99.0 | 1188 | 0.5076 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
| 0.0059 | 100.0 | 1200 | 0.5080 | 0.9792 | 0.8868 | 0.9405 | 0.94 | 4.9792 |
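
The Rouge and Gen Len columns are the kind of output produced by a compute_metrics hook passed to the trainer. The card does not include the training script, so the following is only a plausible sketch using the evaluate library, as is common for T5 summarization fine-tunes; nothing in it is confirmed by the card.

```python
# Plausible metrics hook (not taken from the card): decodes generated ids
# and reference labels, computes Rouge with the evaluate library, and
# reports the mean generated length as "gen_len".
import numpy as np
import evaluate
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google-t5/t5-small")
rouge = evaluate.load("rouge")

def compute_metrics(eval_pred):
    predictions, labels = eval_pred
    decoded_preds = tokenizer.batch_decode(predictions, skip_special_tokens=True)
    # Replace the -100 padding used in labels before decoding
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)

    result = rouge.compute(predictions=decoded_preds,
                           references=decoded_labels, use_stemmer=True)
    # "Gen Len": mean number of non-padding tokens in the generations
    pred_lens = [np.count_nonzero(p != tokenizer.pad_token_id) for p in predictions]
    result["gen_len"] = float(np.mean(pred_lens))
    return {k: round(v, 4) for k, v in result.items()}
```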

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.3.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1