ft-t5-with-dill-sum

This model is a fine-tuned version of t5-small on the billsum dataset. It achieves the following results on the evaluation set:

  • Loss: 2.3109
  • Rouge1: 0.1886
  • Rouge2: 0.104
  • Rougel: 0.166
  • Rougelsum: 0.1659
  • Gen Len: 19.0
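For readers unfamiliar with the metric, Rouge1 is a unigram-overlap score between a generated summary and its reference. The following is a simplified sketch, assuming lowercased whitespace tokenization and no stemming. The evaluation library behind the numbers above may tokenize and stem differently, so this is not expected to reproduce them. The function name `rouge1_f1` and the example strings are illustrative, not part of the original card.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Simplified ROUGE-1 F-measure: unigram overlap, whitespace tokens, no stemming."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped overlap: a token counts at most as often as it appears in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# Illustrative strings, not taken from billsum.
score = rouge1_f1("the cat sat on the mat", "the cat sat")
print(f"{score:.4f}")
```

In this example the candidate has perfect precision but only half the recall, because it covers three of the six reference tokens. A similar effect may contribute to the scores above: a generation length fixed at 19.0 tokens leaves little room to cover a longer reference summary.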

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 16
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 15
  • mixed_precision_training: Native AMP
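Two of these settings interact in ways worth checking by hand. The total train batch size is the per-device batch size times the gradient accumulation steps (8 × 4 = 32, assuming a single device, which the card does not state). The run also lasts 15 epochs at 31 optimizer steps each, which is 465 steps and fewer than the 500 warmup steps. The sketch below reimplements a common linear warmup and decay rule in plain Python. It is not the library's code, and `linear_schedule_lr` is a hypothetical name. Under that rule the learning rate would still be ramping up when training ends.

```python
def linear_schedule_lr(step, peak_lr=5e-05, warmup_steps=500, total_steps=465):
    """Linear warmup to peak_lr, then linear decay to zero (a common formulation)."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Effective batch size, assuming a single device (the device count is not stated).
train_batch_size = 8
gradient_accumulation_steps = 4
effective_batch_size = train_batch_size * gradient_accumulation_steps

# 15 epochs at 31 optimizer steps per epoch, as listed in the training results.
total_steps = 15 * 31

# Learning rate applied at the last optimizer step (steps are counted from 0).
final_lr = linear_schedule_lr(total_steps - 1, total_steps=total_steps)
print(effective_batch_size, total_steps, final_lr)
```

If this formulation matches the scheduler actually used, the learning rate peaks at roughly 4.64e-05 instead of the configured 5e-05 and never enters the decay phase. A shorter warmup or a longer run would be needed to exercise the full schedule.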

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|---|---|---|---|---|---|---|---|---|
| 2.5462 | 1.0 | 31 | 2.4185 | 0.187 | 0.1023 | 0.1637 | 0.1639 | 19.0 |
| 2.5478 | 2.0 | 62 | 2.4166 | 0.187 | 0.1018 | 0.1637 | 0.1639 | 19.0 |
| 2.5729 | 3.0 | 93 | 2.4114 | 0.1868 | 0.1015 | 0.1637 | 0.1638 | 19.0 |
| 2.5806 | 4.0 | 124 | 2.4072 | 0.1855 | 0.1006 | 0.1626 | 0.1627 | 19.0 |
| 2.5231 | 5.0 | 155 | 2.4025 | 0.1877 | 0.1042 | 0.165 | 0.165 | 19.0 |
| 2.5245 | 6.0 | 186 | 2.3948 | 0.1869 | 0.1024 | 0.1642 | 0.1642 | 19.0 |
| 2.5273 | 7.0 | 217 | 2.3860 | 0.1886 | 0.1032 | 0.1652 | 0.1653 | 19.0 |
| 2.4941 | 8.0 | 248 | 2.3765 | 0.188 | 0.1033 | 0.1649 | 0.165 | 19.0 |
| 2.4612 | 9.0 | 279 | 2.3698 | 0.19 | 0.1057 | 0.1671 | 0.1671 | 19.0 |
| 2.463 | 10.0 | 310 | 2.3578 | 0.1882 | 0.1039 | 0.1662 | 0.1663 | 19.0 |
| 2.4539 | 11.0 | 341 | 2.3491 | 0.1898 | 0.1057 | 0.1667 | 0.1667 | 19.0 |
| 2.441 | 12.0 | 372 | 2.3392 | 0.1901 | 0.1055 | 0.1669 | 0.1668 | 19.0 |
| 2.4389 | 13.0 | 403 | 2.3292 | 0.1893 | 0.1053 | 0.1666 | 0.1665 | 19.0 |
| 2.3945 | 14.0 | 434 | 2.3203 | 0.1903 | 0.1051 | 0.1676 | 0.1675 | 19.0 |
| 2.4148 | 15.0 | 465 | 2.3109 | 0.1886 | 0.104 | 0.166 | 0.1659 | 19.0 |
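The headline metrics at the top of this card come from the final epoch, which is not necessarily the best one on every metric. As a small sketch, the per-epoch validation loss and Rouge1 values from the table can be compared directly. The numbers are copied from the rows above, and the variable names are illustrative.

```python
# (epoch, validation_loss, rouge1) copied from the training results table.
results = [
    (1, 2.4185, 0.1870), (2, 2.4166, 0.1870), (3, 2.4114, 0.1868),
    (4, 2.4072, 0.1855), (5, 2.4025, 0.1877), (6, 2.3948, 0.1869),
    (7, 2.3860, 0.1886), (8, 2.3765, 0.1880), (9, 2.3698, 0.1900),
    (10, 2.3578, 0.1882), (11, 2.3491, 0.1898), (12, 2.3392, 0.1901),
    (13, 2.3292, 0.1893), (14, 2.3203, 0.1903), (15, 2.3109, 0.1886),
]

# Epoch with the highest Rouge1 and epoch with the lowest validation loss.
best_by_rouge1 = max(results, key=lambda row: row[2])
best_by_loss = min(results, key=lambda row: row[1])

# True if validation loss drops at every epoch compared with the previous one.
loss_always_improves = all(
    later[1] < earlier[1] for earlier, later in zip(results, results[1:])
)

print("best Rouge1:", best_by_rouge1)
print("lowest validation loss:", best_by_loss)
print("loss improves every epoch:", loss_always_improves)
```

Validation loss falls at every epoch and is lowest at epoch 15, while Rouge1 is highest at epoch 14 (0.1903 against 0.1886 for the final epoch). The Rouge1 spread across all epochs is under 0.005, so these differences may be within noise. The steadily falling loss suggests the run may not have converged after 15 epochs.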

Framework versions

  • Transformers 4.41.1
  • PyTorch 2.3.0+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1

Model size

  • Parameters: 60.5M
  • Tensor type: F32
  • Weights format: Safetensors