background-summaries-flan-t5-xl

This model is a fine-tuned version of google/flan-t5-xl on the hf_dataset_script dataset. After the final training epoch (epoch 10, step 450), it achieves the following results on the evaluation set:

  • Loss: 2.1489
  • Rouge1: 43.0
  • Rouge2: 20.2
  • Rougel: 28.9
  • Rougelsum: 39.5
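The ROUGE figures above appear to be F-measures on a 0 to 100 scale, which is an assumption based on the usual output of summarization fine-tuning scripts; the card does not state it. As a rough illustration of what Rouge1 and Rouge2 measure, here is a simplified sketch of ROUGE-N as clipped n-gram overlap. It uses lowercased whitespace tokens and no stemming, so it will not reproduce the card's numbers exactly, and the function names and example sentences are made up for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(reference, candidate, n=1):
    """Simplified ROUGE-N F1: lowercased whitespace tokens, no stemming."""
    ref = ngrams(reference.lower().split(), n)
    cand = ngrams(candidate.lower().split(), n)
    # Counter intersection keeps the minimum count per n-gram (clipped matches).
    overlap = sum((ref & cand).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Made-up example texts, not taken from the evaluation set.
reference = "the cat sat on the mat"
candidate = "the cat lay on the mat"

print(round(100 * rouge_n_f1(reference, candidate, n=1), 1))  # 83.3
print(round(100 * rouge_n_f1(reference, candidate, n=2), 1))  # 60.0
```

Rougel and Rougelsum are based on the longest common subsequence rather than fixed-length n-grams and are not covered by this sketch.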

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 2
  • total_train_batch_size: 16
  • total_eval_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 10
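The batch-size and schedule values above are related by simple arithmetic, sketched below. The steps-per-epoch value of 45 is taken from the Step column of the training results. The sketch assumes zero warmup steps and no gradient accumulation, since the card lists neither, and `linear_lr` is a hypothetical helper that mirrors the usual linear-decay rule rather than the exact scheduler used in training.

```python
per_device_train_batch_size = 8   # train_batch_size from the card
num_devices = 2
num_epochs = 10
steps_per_epoch = 45              # Step column: 45 steps at epoch 1.0
base_lr = 1e-05

# Effective batch size per optimizer step across both GPUs.
total_train_batch_size = per_device_train_batch_size * num_devices
total_steps = steps_per_epoch * num_epochs

# With 45 steps of 16 examples, the training set holds between 705 and 720
# examples (assuming the last partial batch is kept and no accumulation).
max_train_examples = steps_per_epoch * total_train_batch_size
min_train_examples = (steps_per_epoch - 1) * total_train_batch_size + 1


def linear_lr(step, total_steps, base_lr, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


print(total_train_batch_size)                  # 16
print(total_steps)                             # 450
print(min_train_examples, max_train_examples)  # 705 720
for step in (0, 225, 450):
    print(step, f"{linear_lr(step, total_steps, base_lr):.2e}")
```

Under these assumptions the learning rate starts at 1e-05, is halved by the end of epoch 5, and reaches zero at step 450.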

Training results

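| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
|---------------|-------|------|-----------------|--------|--------|--------|-----------|
| No log        | 1.0   | 45   | 1.7449          | 37.9   | 17.2   | 25.4   | 34.5      |
| No log        | 2.0   | 90   | 1.7964          | 40.8   | 19.0   | 27.5   | 37.3      |
| No log        | 3.0   | 135  | 1.8705          | 39.5   | 18.2   | 26.7   | 36.1      |
| No log        | 4.0   | 180  | 1.9253          | 40.1   | 18.7   | 27.0   | 36.6      |
| No log        | 5.0   | 225  | 1.9471          | 41.8   | 19.6   | 28.0   | 38.4      |
| No log        | 6.0   | 270  | 2.0004          | 42.5   | 20.0   | 28.5   | 39.0      |
| No log        | 7.0   | 315  | 1.9927          | 43.2   | 20.6   | 29.1   | 39.7      |
| No log        | 8.0   | 360  | 2.0119          | 42.6   | 20.4   | 28.8   | 39.1      |
| No log        | 9.0   | 405  | 2.0653          | 42.7   | 20.3   | 28.7   | 39.1      |
| No log        | 10.0  | 450  | 2.1489          | 43.0   | 20.2   | 28.9   | 39.5      |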
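The per-epoch results can be scanned for the best checkpoint under different criteria. The sketch below copies the rows from the training results and picks the epoch with the lowest validation loss and the epoch with the highest Rouge1. The variable names are illustrative, and nothing here implies that intermediate checkpoints were saved or published.

```python
# (epoch, validation_loss, rouge1, rouge2, rougel, rougelsum) from the results above
results = [
    (1, 1.7449, 37.9, 17.2, 25.4, 34.5),
    (2, 1.7964, 40.8, 19.0, 27.5, 37.3),
    (3, 1.8705, 39.5, 18.2, 26.7, 36.1),
    (4, 1.9253, 40.1, 18.7, 27.0, 36.6),
    (5, 1.9471, 41.8, 19.6, 28.0, 38.4),
    (6, 2.0004, 42.5, 20.0, 28.5, 39.0),
    (7, 1.9927, 43.2, 20.6, 29.1, 39.7),
    (8, 2.0119, 42.6, 20.4, 28.8, 39.1),
    (9, 2.0653, 42.7, 20.3, 28.7, 39.1),
    (10, 2.1489, 43.0, 20.2, 28.9, 39.5),
]

best_by_loss = min(results, key=lambda row: row[1])
best_by_rouge1 = max(results, key=lambda row: row[2])

print("lowest validation loss: epoch", best_by_loss[0])   # epoch 1
print("highest Rouge1:         epoch", best_by_rouge1[0])  # epoch 7
```

Validation loss is lowest after epoch 1 and rises almost steadily afterwards, while all four ROUGE scores peak at epoch 7, slightly above the final-epoch values. If checkpoints are kept in a rerun, selecting by a ROUGE metric (for example through the `load_best_model_at_end` and `metric_for_best_model` training arguments in Transformers) may be preferable to taking the last epoch.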

Framework versions

  • Transformers 4.27.4
  • Pytorch 2.0.0+cu118
  • Datasets 2.11.0
  • Tokenizers 0.13.3
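
To compare a local environment against the versions listed above, a small check along these lines may help. It is only a sketch: the PyPI distribution names (in particular `torch` for Pytorch) are assumptions, `version_mismatches` is a hypothetical helper, and exact matching is stricter than most setups need.

```python
from importlib import metadata

# Versions from the card; keys are assumed PyPI distribution names.
EXPECTED = {
    "transformers": "4.27.4",
    "torch": "2.0.0+cu118",
    "datasets": "2.11.0",
    "tokenizers": "0.13.3",
}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_mismatches(expected, lookup=installed_version):
    """Map each package whose installed version differs to (expected, found)."""
    mismatches = {}
    for name, wanted in expected.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


for name, (wanted, found) in version_mismatches(EXPECTED).items():
    print(f"{name}: card lists {wanted}, found {found or 'not installed'}")
```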