
flan-t5-base-samsum

This model is a fine-tuned version of google/flan-t5-base on the samsum dataset, a collection of about 16k messenger-like conversations with summaries; the conversations were created and written down by linguists fluent in English. It achieves the following results on the evaluation set (a short usage sketch follows the metrics list):

  • Loss: 1.3715
  • Rouge1: 47.339
  • Rouge2: 23.8991
  • RougeL: 40.0668
  • RougeLsum: 43.6669
  • Gen Len: 17.2063
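
Since the checkpoint keeps the standard flan-t5 text-to-text interface, it can be used through the Transformers summarization `pipeline`. The sketch below is minimal and the model path is a placeholder; point it at the actual checkpoint directory or Hub repo id.

```python
# Minimal usage sketch. The model path below is a placeholder; replace it with
# the local output directory or the Hub repo id of this fine-tuned checkpoint.
from transformers import pipeline

summarizer = pipeline("summarization", model="flan-t5-base-samsum")

dialogue = (
    "Anna: Are we still on for lunch tomorrow?\n"
    "Ben: Yes, 12:30 at the usual place.\n"
    "Anna: Perfect, see you then!"
)

print(summarizer(dialogue, max_length=60)[0]["summary_text"])
```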

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

The model was fine-tuned on the samsum dataset described above; the evaluation results reported in this card were computed on its validation split (see Training results).

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a minimal reproduction sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 5
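
A minimal sketch of how these hyperparameters map onto `Seq2SeqTrainingArguments`. The "summarize: " prefix, maximum sequence lengths, and per-epoch evaluation strategy are assumptions; the remaining values mirror the list above (the Adam betas and epsilon listed are the Trainer defaults).

```python
# Hedged sketch of the fine-tuning setup; preprocessing details are assumptions.
from datasets import load_dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

model_name = "google/flan-t5-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

dataset = load_dataset("samsum")  # ~16k dialogue/summary pairs

def preprocess(batch):
    # Assumed preprocessing: T5-style task prefix and truncation lengths.
    inputs = tokenizer(
        ["summarize: " + d for d in batch["dialogue"]],
        max_length=512,
        truncation=True,
    )
    labels = tokenizer(text_target=batch["summary"], max_length=128, truncation=True)
    inputs["labels"] = labels["input_ids"]
    return inputs

tokenized = dataset.map(preprocess, batched=True, remove_columns=dataset["train"].column_names)

training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-base-samsum",
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=5,
    evaluation_strategy="epoch",   # assumption, consistent with the per-epoch results below
    predict_with_generate=True,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```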

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | RougeL  | RougeLsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 1.4525        | 1.0   | 1842 | 1.3841          | 46.331  | 22.8895 | 39.0677 | 42.845    | 17.2149 |
| 1.3436        | 2.0   | 3684 | 1.3732          | 47.0523 | 23.5437 | 39.8169 | 43.449    | 17.1954 |
| 1.2821        | 3.0   | 5526 | 1.3717          | 47.2418 | 23.6518 | 39.774  | 43.5104   | 17.2295 |
| 1.2307        | 4.0   | 7368 | 1.3715          | 47.339  | 23.8991 | 40.0668 | 43.6669   | 17.2063 |
| 1.1985        | 5.0   | 9210 | 1.3770          | 47.4925 | 24.0124 | 40.128  | 43.8232   | 17.2833 |
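
The ROUGE columns appear to follow the usual convention of reporting F-measure scores on a 0-100 scale. A small sketch of how such scores are typically computed with the `evaluate` library (the generation settings behind the table are not specified here, so this only illustrates the metric call):

```python
# Hedged sketch: computing ROUGE with the `evaluate` library. The prediction and
# reference strings are illustrative only, not outputs of this model.
import evaluate

rouge = evaluate.load("rouge")

predictions = ["Anna and Ben will meet for lunch at 12:30 tomorrow."]
references = ["Anna and Ben are meeting for lunch tomorrow at 12:30."]

scores = rouge.compute(predictions=predictions, references=references, use_stemmer=True)
# Keys: rouge1, rouge2, rougeL, rougeLsum; multiply by 100 to match the table's scale.
print({k: round(v * 100, 4) for k, v in scores.items()})
```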

Framework versions

  • Transformers 4.37.2
  • Pytorch 2.1.0+cu121
  • Datasets 2.16.1
  • Tokenizers 0.15.1