
t5-summarization-zero-shot-headers-and-better-prompt

This model is a fine-tuned version of google/flan-t5-small on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 2.2226
  • Rouge: {'rouge1': 0.4351, 'rouge2': 0.2124, 'rougeL': 0.215, 'rougeLsum': 0.215}
  • Bert Score: 0.8806
  • Bleurt 20: -0.7502
  • Gen Len: 14.645
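
A minimal usage sketch with the transformers library. The prompt format used during fine-tuning is not documented in this card, so the generic T5-style "summarize:" prefix below is only an assumption:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "veronica-girolimetti/t5-summarization-zero-shot-headers-and-better-prompt"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# NOTE: the exact prompt template used during fine-tuning is not documented;
# the "summarize:" prefix is the generic T5 convention and is an assumption.
text = "summarize: <your document here>"
inputs = tokenizer(text, return_tensors="pt", truncation=True)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```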

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 7
  • eval_batch_size: 7
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
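
A minimal sketch of these settings expressed as Seq2SeqTrainingArguments. The actual Trainer code, dataset loading, and prompt formatting are not included with this card, so the output directory and generation flag below are assumptions:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="t5-summarization-zero-shot-headers-and-better-prompt",  # placeholder
    learning_rate=1e-4,
    per_device_train_batch_size=7,
    per_device_eval_batch_size=7,
    seed=42,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=20,
    evaluation_strategy="epoch",
    predict_with_generate=True,  # assumption: generation-based eval for ROUGE/BERTScore
)
```

The optimizer listed above (Adam with betas=(0.9, 0.999) and epsilon=1e-08) matches the transformers default, so it is not set explicitly here.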

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge | Bert Score | Bleurt 20 | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:------|:----------:|:---------:|:-------:|
| 3.0683 | 1.0 | 186 | 2.5857 | {'rouge1': 0.4573, 'rouge2': 0.1803, 'rougeL': 0.1858, 'rougeLsum': 0.1858} | 0.8683 | -0.8521 | 15.445 |
| 2.7283 | 2.0 | 372 | 2.4092 | {'rouge1': 0.446, 'rouge2': 0.1853, 'rougeL': 0.1969, 'rougeLsum': 0.1969} | 0.8709 | -0.828 | 15.115 |
| 2.4766 | 3.0 | 558 | 2.3190 | {'rouge1': 0.4183, 'rouge2': 0.1834, 'rougeL': 0.1947, 'rougeLsum': 0.1947} | 0.869 | -0.8673 | 14.425 |
| 2.351 | 4.0 | 744 | 2.2736 | {'rouge1': 0.4264, 'rouge2': 0.1843, 'rougeL': 0.1919, 'rougeLsum': 0.1919} | 0.8693 | -0.8411 | 15.205 |
| 2.287 | 5.0 | 930 | 2.2440 | {'rouge1': 0.42, 'rouge2': 0.1924, 'rougeL': 0.1991, 'rougeLsum': 0.1991} | 0.875 | -0.8358 | 14.305 |
| 2.1426 | 6.0 | 1116 | 2.2100 | {'rouge1': 0.4196, 'rouge2': 0.1903, 'rougeL': 0.2027, 'rougeLsum': 0.2027} | 0.8779 | -0.8189 | 14.38 |
| 2.0381 | 7.0 | 1302 | 2.2171 | {'rouge1': 0.459, 'rouge2': 0.2143, 'rougeL': 0.2142, 'rougeLsum': 0.2142} | 0.8772 | -0.7757 | 14.825 |
| 1.9927 | 8.0 | 1488 | 2.2106 | {'rouge1': 0.44, 'rouge2': 0.2073, 'rougeL': 0.2132, 'rougeLsum': 0.2132} | 0.8795 | -0.7798 | 14.53 |
| 1.9347 | 9.0 | 1674 | 2.1976 | {'rouge1': 0.4289, 'rouge2': 0.2062, 'rougeL': 0.2122, 'rougeLsum': 0.2122} | 0.88 | -0.7774 | 14.14 |
| 1.8733 | 10.0 | 1860 | 2.1987 | {'rouge1': 0.4472, 'rouge2': 0.215, 'rougeL': 0.2124, 'rougeLsum': 0.2124} | 0.8791 | -0.7688 | 14.49 |
| 1.7883 | 11.0 | 2046 | 2.1963 | {'rouge1': 0.4375, 'rouge2': 0.2114, 'rougeL': 0.2064, 'rougeLsum': 0.2064} | 0.8786 | -0.785 | 14.66 |
| 1.8253 | 12.0 | 2232 | 2.2055 | {'rouge1': 0.4351, 'rouge2': 0.2073, 'rougeL': 0.2106, 'rougeLsum': 0.2106} | 0.8803 | -0.7759 | 14.59 |
| 1.7751 | 13.0 | 2418 | 2.2029 | {'rouge1': 0.4371, 'rouge2': 0.2125, 'rougeL': 0.2119, 'rougeLsum': 0.2119} | 0.8796 | -0.7711 | 14.7 |
| 1.7087 | 14.0 | 2604 | 2.2073 | {'rouge1': 0.448, 'rouge2': 0.2211, 'rougeL': 0.2176, 'rougeLsum': 0.2176} | 0.8806 | -0.7492 | 14.695 |
| 1.7034 | 15.0 | 2790 | 2.2150 | {'rouge1': 0.4381, 'rouge2': 0.214, 'rougeL': 0.2158, 'rougeLsum': 0.2158} | 0.8809 | -0.7611 | 14.555 |
| 1.6671 | 16.0 | 2976 | 2.2211 | {'rouge1': 0.4388, 'rouge2': 0.2162, 'rougeL': 0.2169, 'rougeLsum': 0.2169} | 0.8797 | -0.7532 | 14.73 |
| 1.6964 | 17.0 | 3162 | 2.2207 | {'rouge1': 0.4316, 'rouge2': 0.2117, 'rougeL': 0.2137, 'rougeLsum': 0.2137} | 0.8799 | -0.7729 | 14.54 |
| 1.6556 | 18.0 | 3348 | 2.2183 | {'rouge1': 0.4379, 'rouge2': 0.2122, 'rougeL': 0.2163, 'rougeLsum': 0.2163} | 0.8804 | -0.7475 | 14.735 |
| 1.6391 | 19.0 | 3534 | 2.2200 | {'rouge1': 0.4332, 'rouge2': 0.2105, 'rougeL': 0.2149, 'rougeLsum': 0.2149} | 0.8805 | -0.7521 | 14.635 |
| 1.6309 | 20.0 | 3720 | 2.2226 | {'rouge1': 0.4351, 'rouge2': 0.2124, 'rougeL': 0.215, 'rougeLsum': 0.215} | 0.8806 | -0.7502 | 14.645 |
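
The evaluation script is not included with this card; the sketch below shows one plausible way to compute the reported metrics (ROUGE, BERTScore, and BLEURT-20) with the evaluate library, using placeholder predictions and references:

```python
import evaluate

# Assumed metric setup; the original evaluation pipeline is not documented.
rouge = evaluate.load("rouge")
bertscore = evaluate.load("bertscore")
bleurt = evaluate.load("bleurt", "BLEURT-20")  # requires the bleurt package and checkpoint download

predictions = ["the generated summary"]   # placeholder
references = ["the reference summary"]    # placeholder

print(rouge.compute(predictions=predictions, references=references))
print(bertscore.compute(predictions=predictions, references=references, lang="en"))
print(bleurt.compute(predictions=predictions, references=references))
```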

Framework versions

  • Transformers 4.35.2
  • Pytorch 2.1.0+cu121
  • Datasets 2.16.1
  • Tokenizers 0.15.0