
flan-t5-base-tldr-100k

This model is a fine-tuned version of google/flan-t5-base on the first 100,000 samples of the Reddit TL;DR dataset. It achieves the following results on the evaluation set, which match the epoch 4 row (step 45000) of the training results:

  • Loss: 2.7323
  • Rouge1: 17.0772
  • Rouge2: 4.4204
  • Rougel: 14.549
  • Rougelsum: 15.0148
  • Gen Len: 16.0925
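
For readers unfamiliar with the metric, the sketch below shows roughly what a Rouge1 score measures: the F1 of unigram overlap between a reference summary and a generated one, scaled to 0-100 as the figures above appear to be. It is a simplified illustration, not the scoring code used for this card; the card does not say which ROUGE implementation was used, and standard tooling typically adds its own tokenization and optional stemming. The function name `rouge1_f1` and the example sentences are hypothetical.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Unigram-overlap F1 between two texts, on a 0-100 scale.

    Simplified: lowercases and splits on whitespace only (an assumption;
    real ROUGE tooling tokenizes more carefully and may stem words).
    """
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped overlap: a token counts at most as often as it occurs in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 100.0 * 2 * precision * recall / (precision + recall)


# Hypothetical example pair, not taken from the dataset.
reference = "my landlord kept my deposit and i need advice"
candidate = "landlord kept my deposit"
score = rouge1_f1(reference, candidate)
print(round(score, 2))
```

A short candidate that only uses reference words gets perfect precision but low recall, which is one reason very short generations (Gen Len near 16 tokens here) can still leave ROUGE scores modest.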

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 5
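
As a rough illustration of what `lr_scheduler_type: linear` implies, the sketch below computes the learning rate at a given optimizer step under a linear decay to zero. It assumes zero warmup steps (the card does not list a warmup setting) and takes the step count per epoch, 11250, from the training results. The function name `linear_lr` and the constants are illustrative, not part of any library.

```python
BASE_LR = 5e-05           # learning_rate from the list above
TRAIN_BATCH_SIZE = 8      # train_batch_size from the list above
NUM_EPOCHS = 5            # num_epochs from the list above
STEPS_PER_EPOCH = 11250   # step count after epoch 1 in the training results
WARMUP_STEPS = 0          # assumption: no warmup is mentioned in the card

TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS
# Assumes a single device and no gradient accumulation (neither is stated).
EXAMPLES_PER_EPOCH = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE


def linear_lr(step: int,
              base_lr: float = BASE_LR,
              total_steps: int = TOTAL_STEPS,
              warmup_steps: int = WARMUP_STEPS) -> float:
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(step, linear_lr(step))
```

Under these assumptions the rate starts at 5e-05, is halved midway through epoch 3, and reaches zero at step 56250.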

Training results

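| Training Loss | Epoch | Step  | Gen Len | Validation Loss | Rouge1  | Rouge2 | Rougel  | Rougelsum |
|---------------|-------|-------|---------|-----------------|---------|--------|---------|-----------|
| 3.1467        | 1.0   | 11250 | 16.4911 | 2.9531          | 15.909  | 3.8156 | 13.4884 | 13.9568   |
| 3.0673        | 2.0   | 22500 | 16.4639 | 2.9318          | 16.4972 | 3.9757 | 13.9517 | 14.4336   |
| 2.9952        | 3.0   | 33750 | 16.2585 | 2.9245          | 16.5997 | 4.1068 | 14.0299 | 14.5147   |
| 2.9524        | 4.0   | 45000 | 16.0925 | 2.7323          | 17.0772 | 4.4204 | 14.549  | 15.0148   |
| 2.9223        | 5.0   | 56250 | 16.2163 | 2.7328          | 17.1468 | 4.4384 | 14.5798 | 15.0572   |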
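
The headline metrics at the top of the card match epoch 4, not the final epoch. The sketch below compares the per-epoch results to show one plausible reason: epoch 4 has the lowest validation loss, while epoch 5 has slightly higher ROUGE scores. Whether the checkpoint was actually selected by validation loss is an assumption; the card does not say. The rows are transcribed from the training results, with each value assigned to the metric it belongs to (validation loss in the 2.7 to 3.0 range, Gen Len near 16). The `EpochResult` type is illustrative.

```python
from typing import NamedTuple


class EpochResult(NamedTuple):
    epoch: int
    step: int
    train_loss: float
    val_loss: float
    rouge1: float
    rouge2: float
    rougel: float
    rougelsum: float
    gen_len: float


# Values copied from the training results in this card.
RESULTS = [
    EpochResult(1, 11250, 3.1467, 2.9531, 15.909, 3.8156, 13.4884, 13.9568, 16.4911),
    EpochResult(2, 22500, 3.0673, 2.9318, 16.4972, 3.9757, 13.9517, 14.4336, 16.4639),
    EpochResult(3, 33750, 2.9952, 2.9245, 16.5997, 4.1068, 14.0299, 14.5147, 16.2585),
    EpochResult(4, 45000, 2.9524, 2.7323, 17.0772, 4.4204, 14.549, 15.0148, 16.0925),
    EpochResult(5, 56250, 2.9223, 2.7328, 17.1468, 4.4384, 14.5798, 15.0572, 16.2163),
]

# Two common selection criteria can disagree on the "best" checkpoint.
best_by_loss = min(RESULTS, key=lambda r: r.val_loss)
best_by_rouge1 = max(RESULTS, key=lambda r: r.rouge1)

print("lowest validation loss: epoch", best_by_loss.epoch)
print("highest Rouge1: epoch", best_by_rouge1.epoch)
```

The gap between the two candidates is small (0.0005 in validation loss, about 0.07 in Rouge1), so either checkpoint would likely behave similarly in practice.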

Framework versions

  • Transformers 4.27.4
  • Pytorch 2.0.0+cu117
  • Datasets 2.11.0
  • Tokenizers 0.13.2