
my_awesome_billsum_model_82

This model is a fine-tuned version of google-t5/t5-small on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2651
  • Rouge1: 0.9769
  • Rouge2: 0.8861
  • RougeL: 0.9414
  • RougeLsum: 0.9398
  • Gen Len: 4.9583
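
A minimal usage sketch, assuming the checkpoint is saved locally or pushed to the Hub under the placeholder identifier below (replace it with the real path or repo id):

```python
# Minimal inference sketch. "my_awesome_billsum_model_82" is assumed to be the
# local directory or Hub repo id where this checkpoint is stored.
from transformers import pipeline

summarizer = pipeline("summarization", model="my_awesome_billsum_model_82")

text = "Text of the document to summarize goes here."
# The evaluation Gen Len of ~5 suggests very short outputs, so a small
# max_length is sufficient.
print(summarizer(text, max_length=32)[0]["summary_text"])
```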

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
  • mixed_precision_training: Native AMP
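
For reproduction, these settings correspond roughly to a Seq2SeqTrainingArguments configuration like the sketch below; the output directory, per-epoch evaluation strategy, and predict_with_generate flag are assumptions rather than values recorded in this card:

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of the reported hyperparameters. output_dir, evaluation strategy, and
# predict_with_generate are assumptions, not recorded in this card.
training_args = Seq2SeqTrainingArguments(
    output_dir="my_awesome_billsum_model_82",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    num_train_epochs=100,
    lr_scheduler_type="linear",
    fp16=True,                    # "Native AMP" mixed-precision training
    evaluation_strategy="epoch",  # matches the per-epoch rows in the table below
    predict_with_generate=True,   # needed to compute ROUGE / Gen Len at eval time
)
```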

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|
| No log | 1.0 | 12 | 0.1788 | 0.9675 | 0.8359 | 0.9136 | 0.9111 | 4.9792 |
| No log | 2.0 | 24 | 0.1578 | 0.9706 | 0.8564 | 0.9219 | 0.9199 | 5.0 |
| No log | 3.0 | 36 | 0.1606 | 0.974 | 0.8654 | 0.9317 | 0.9307 | 4.9375 |
| No log | 4.0 | 48 | 0.1720 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| No log | 5.0 | 60 | 0.1800 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| No log | 6.0 | 72 | 0.1871 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| No log | 7.0 | 84 | 0.1840 | 0.974 | 0.8654 | 0.9317 | 0.9307 | 4.9375 |
| No log | 8.0 | 96 | 0.1802 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| No log | 9.0 | 108 | 0.1672 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| No log | 10.0 | 120 | 0.1875 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| No log | 11.0 | 132 | 0.2060 | 0.9728 | 0.8655 | 0.9285 | 0.927 | 4.9792 |
| No log | 12.0 | 144 | 0.2068 | 0.9728 | 0.8655 | 0.9285 | 0.927 | 4.9792 |
| No log | 13.0 | 156 | 0.2064 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| No log | 14.0 | 168 | 0.2066 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| No log | 15.0 | 180 | 0.1867 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| No log | 16.0 | 192 | 0.1947 | 0.974 | 0.8654 | 0.9317 | 0.9307 | 4.9375 |
| No log | 17.0 | 204 | 0.1979 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| No log | 18.0 | 216 | 0.1971 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| No log | 19.0 | 228 | 0.1865 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| No log | 20.0 | 240 | 0.1757 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| No log | 21.0 | 252 | 0.1735 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| No log | 22.0 | 264 | 0.1846 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| No log | 23.0 | 276 | 0.2039 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| No log | 24.0 | 288 | 0.2251 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| No log | 25.0 | 300 | 0.2272 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| No log | 26.0 | 312 | 0.2165 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| No log | 27.0 | 324 | 0.2202 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| No log | 28.0 | 336 | 0.2166 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| No log | 29.0 | 348 | 0.2151 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| No log | 30.0 | 360 | 0.2151 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| No log | 31.0 | 372 | 0.2136 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| No log | 32.0 | 384 | 0.2206 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| No log | 33.0 | 396 | 0.2233 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| No log | 34.0 | 408 | 0.2220 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| No log | 35.0 | 420 | 0.2263 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| No log | 36.0 | 432 | 0.2298 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| No log | 37.0 | 444 | 0.2413 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| No log | 38.0 | 456 | 0.2407 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| No log | 39.0 | 468 | 0.2407 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| No log | 40.0 | 480 | 0.2420 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| No log | 41.0 | 492 | 0.2424 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0483 | 42.0 | 504 | 0.2442 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0483 | 43.0 | 516 | 0.2466 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0483 | 44.0 | 528 | 0.2416 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0483 | 45.0 | 540 | 0.2442 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0483 | 46.0 | 552 | 0.2457 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0483 | 47.0 | 564 | 0.2383 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0483 | 48.0 | 576 | 0.2481 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0483 | 49.0 | 588 | 0.2512 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0483 | 50.0 | 600 | 0.2510 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0483 | 51.0 | 612 | 0.2516 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0483 | 52.0 | 624 | 0.2491 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0483 | 53.0 | 636 | 0.2480 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0483 | 54.0 | 648 | 0.2493 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0483 | 55.0 | 660 | 0.2417 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0483 | 56.0 | 672 | 0.2320 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0483 | 57.0 | 684 | 0.2270 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0483 | 58.0 | 696 | 0.2351 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0483 | 59.0 | 708 | 0.2414 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0483 | 60.0 | 720 | 0.2490 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0483 | 61.0 | 732 | 0.2489 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0483 | 62.0 | 744 | 0.2496 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0483 | 63.0 | 756 | 0.2505 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0483 | 64.0 | 768 | 0.2515 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0483 | 65.0 | 780 | 0.2511 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0483 | 66.0 | 792 | 0.2521 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0483 | 67.0 | 804 | 0.2530 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0483 | 68.0 | 816 | 0.2536 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0483 | 69.0 | 828 | 0.2535 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0483 | 70.0 | 840 | 0.2575 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0483 | 71.0 | 852 | 0.2593 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0483 | 72.0 | 864 | 0.2588 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0483 | 73.0 | 876 | 0.2654 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0483 | 74.0 | 888 | 0.2622 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0483 | 75.0 | 900 | 0.2597 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0483 | 76.0 | 912 | 0.2586 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0483 | 77.0 | 924 | 0.2566 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0483 | 78.0 | 936 | 0.2554 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0483 | 79.0 | 948 | 0.2560 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0483 | 80.0 | 960 | 0.2582 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0483 | 81.0 | 972 | 0.2614 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0483 | 82.0 | 984 | 0.2652 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0483 | 83.0 | 996 | 0.2685 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0231 | 84.0 | 1008 | 0.2696 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0231 | 85.0 | 1020 | 0.2700 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0231 | 86.0 | 1032 | 0.2715 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0231 | 87.0 | 1044 | 0.2697 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0231 | 88.0 | 1056 | 0.2692 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0231 | 89.0 | 1068 | 0.2666 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0231 | 90.0 | 1080 | 0.2666 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0231 | 91.0 | 1092 | 0.2671 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0231 | 92.0 | 1104 | 0.2665 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0231 | 93.0 | 1116 | 0.2655 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0231 | 94.0 | 1128 | 0.2646 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0231 | 95.0 | 1140 | 0.2652 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0231 | 96.0 | 1152 | 0.2656 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0231 | 97.0 | 1164 | 0.2657 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0231 | 98.0 | 1176 | 0.2656 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0231 | 99.0 | 1188 | 0.2654 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
| 0.0231 | 100.0 | 1200 | 0.2651 | 0.9769 | 0.8861 | 0.9414 | 0.9398 | 4.9583 |
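
The ROUGE and Gen Len columns above are the kind of metrics produced by a compute_metrics hook built on the evaluate library; a minimal sketch of such a hook is shown below (an assumed setup following the standard T5 summarization recipe, not code taken from this repository):

```python
import numpy as np
import evaluate
from transformers import AutoTokenizer

rouge = evaluate.load("rouge")
tokenizer = AutoTokenizer.from_pretrained("google-t5/t5-small")

def compute_metrics(eval_pred):
    """Decode generated ids and references, then compute ROUGE and mean length."""
    predictions, labels = eval_pred
    decoded_preds = tokenizer.batch_decode(predictions, skip_special_tokens=True)
    # Labels use -100 for padding; swap it back to the pad token before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)

    result = rouge.compute(
        predictions=decoded_preds, references=decoded_labels, use_stemmer=True
    )
    # "Gen Len": average number of non-padding tokens in the generated outputs.
    result["gen_len"] = np.mean(
        [np.count_nonzero(pred != tokenizer.pad_token_id) for pred in predictions]
    )
    return {k: round(float(v), 4) for k, v in result.items()}
```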

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.3.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1