
flan-t5-large-samsum

This model is a fine-tuned version of google/flan-t5-large on the samsum dataset.

It achieves the following results on the evaluation set:

  • Loss: 1.1754
  • Rouge1: 54.1595
  • Rouge2: 29.1081
  • Rougel: 45.4989
  • Rougelsum: 49.1026
  • Gen Len: 28.72

Note: the stacked variant of this model is evaluated on a different validation set (the stacked-summaries one), whereas this model is evaluated on samsum only.
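
For reference, scores in this format can be computed with the evaluate library. The sketch below is an illustration only, assuming the samsum test split and default generation settings; the exact generation parameters behind the reported numbers are not documented in this card.

```python
# Sketch of ROUGE evaluation on samsum (assumed settings, small slice for speed).
from datasets import load_dataset
from transformers import pipeline
import evaluate

summarizer = pipeline(
    "summarization",
    model="stacked-summaries/flan-t5-large-samsum",
)
rouge = evaluate.load("rouge")

# Use a small slice of the test split purely for illustration.
data = load_dataset("samsum", split="test[:100]")
preds = [
    out["summary_text"]
    for out in summarizer(data["dialogue"], max_length=64, truncation=True)
]
print(rouge.compute(predictions=preds, references=data["summary"]))
```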

Model description

More information needed

Intended uses & limitations

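The model is intended for abstractive summarization of dialogue, in the style of the samsum dataset. A minimal inference sketch using the standard transformers summarization pipeline follows; the example dialogue and generation settings are illustrative assumptions, not the card's official configuration.

```python
# Minimal inference sketch; generation settings are illustrative.
from transformers import pipeline

summarizer = pipeline(
    "summarization",
    model="stacked-summaries/flan-t5-large-samsum",
)

dialogue = (
    "Anna: Are we still on for lunch tomorrow?\n"
    "Ben: Yes! Noon at the usual place?\n"
    "Anna: Perfect, see you then."
)
print(summarizer(dialogue, max_length=60)[0]["summary_text"])
```
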
Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 8
  • eval_batch_size: 4
  • seed: 17868
  • distributed_type: multi-GPU
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.04
  • num_epochs: 5.0
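
For reproducibility, the hyperparameters above map onto Seq2SeqTrainingArguments roughly as sketched below. The output_dir is a placeholder, and any option not listed above is left at its transformers default (the stated Adam betas and epsilon already match those defaults).

```python
# Sketch of training arguments matching the listed hyperparameters.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./flan-t5-large-samsum",  # placeholder path
    learning_rate=1e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=4,
    seed=17868,
    gradient_accumulation_steps=16,       # 8 * 16 = 128 total train batch size
    lr_scheduler_type="cosine",
    warmup_ratio=0.04,
    num_train_epochs=5.0,
    predict_with_generate=True,           # generate summaries for ROUGE at eval time
)
```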

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 1.2106        | 0.43  | 50   | 1.1889          | 52.5898 | 26.9967 | 43.6944 | 47.9656   | 24.5167 |
| 1.213         | 0.87  | 100  | 1.1760          | 52.4279 | 27.4689 | 43.7873 | 48.0581   | 25.0533 |
| 1.0726        | 1.3   | 150  | 1.1731          | 52.8246 | 26.9524 | 43.7429 | 48.0345   | 25.55   |
| 1.0784        | 1.74  | 200  | 1.1708          | 53.1291 | 27.9056 | 44.2609 | 48.6883   | 26.03   |
| 1.0215        | 2.17  | 250  | 1.1755          | 53.1512 | 27.9475 | 44.1442 | 48.4619   | 27.57   |
| 1.0294        | 2.61  | 300  | 1.1711          | 53.4402 | 28.0126 | 44.5454 | 48.6432   | 25.9033 |
| 1.0016        | 3.04  | 350  | 1.1718          | 53.9395 | 28.3087 | 45.191  | 49.2773   | 26.6133 |
| 0.9576        | 3.48  | 400  | 1.1741          | 53.9004 | 28.3243 | 45.0911 | 48.9182   | 26.33   |
| 0.9739        | 3.91  | 450  | 1.1754          | 53.7049 | 28.419  | 44.8946 | 48.8708   | 27.2433 |
| 0.9505        | 4.35  | 500  | 1.1781          | 53.7142 | 28.1758 | 44.8324 | 48.9498   | 26.8667 |
| 0.9993        | 4.78  | 550  | 1.1784          | 53.87   | 28.2211 | 44.893  | 49.1074   | 27.2167 |