dialogsum

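This model is a fine-tuned version of google/flan-t5-base. The card metadata does not record the training dataset (the auto-generated field reads "None"). The model achieves the following results on the evaluation set, which correspond to the final (epoch 5) checkpoint: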

  • Loss: 1.0245
  • Rouge1: 45.6581
  • Rouge2: 15.9871
  • Rougel: 43.5188
  • Rougelsum: 43.913
  • Gen Len: 19.0
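
As a rough illustration of what the Rouge1 figure measures, the sketch below computes a unigram-overlap F-score between a reference summary and a generated one. It assumes lowercasing, whitespace tokenization, no stemming, and a 0 to 100 scale; the card does not say how the reported scores were computed, so this simplified function (`rouge1_f` is a hypothetical name) will not reproduce them exactly.

```python
from collections import Counter


def rouge1_f(reference: str, candidate: str) -> float:
    """Simplified ROUGE-1 F-score on a 0-100 scale (assumed convention)."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())

    # Clipped unigram overlap: each token counts at most as often as it
    # appears in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 100 * 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(rouge1_f("the cat sat on the mat", "the cat lay on the mat"))
```

Rouge2 applies the same idea to bigrams, and the Rougel variants are based on the longest common subsequence rather than fixed-size n-grams.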

Model description

More information needed

Intended uses & limitations

More information needed
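
Pending those details, here is a minimal sketch of how a flan-t5-base checkpoint like this one might be loaded for summarization with the Transformers library. Several things are assumptions, not facts from this card: the `"summarize: "` prompt prefix, the 512-token input limit, and the default `model_id`, which points at the base model because this card does not give the fine-tuned repository id. Gen Len is 19.0 at every epoch in the results, which may indicate that evaluation-time generations were capped by a default length limit, so the sketch sets `max_new_tokens` explicitly.

```python
def build_input(dialogue: str, prefix: str = "summarize: ") -> str:
    """Prepend an instruction prefix (assumed; the card does not state one)."""
    return prefix + dialogue.strip()


def summarize(
    dialogue: str,
    model_id: str = "google/flan-t5-base",  # placeholder: substitute the fine-tuned repo id
    max_new_tokens: int = 64,
) -> str:
    # Imported lazily so build_input can be used without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    inputs = tokenizer(
        build_input(dialogue),
        return_tensors="pt",
        truncation=True,
        max_length=512,  # assumed input limit
    )
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    text = "A: Are we still on for lunch?\nB: Yes, noon at the usual place."
    print(summarize(text))
```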

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 5
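
The values above could be passed to the Transformers trainer roughly as follows. This is a sketch under assumptions: the keyword names are the `Seq2SeqTrainingArguments` fields that appear to correspond to each listed hyperparameter, `predict_with_generate=True` and the output directory are guesses, and the card does not mention warmup, so the schedule helper defaults to zero warmup steps. The total of 3930 steps is taken from the training results table.

```python
# Hyperparameters copied from the list above, mapped to the assumed
# Seq2SeqTrainingArguments keyword names.
TRAINING_KWARGS = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 5,
}


def build_training_args(output_dir: str = "dialogsum-out"):
    """Build trainer arguments; output_dir is a hypothetical path."""
    # Imported lazily so the rest of the sketch runs without transformers.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(
        output_dir=output_dir,
        predict_with_generate=True,  # assumption: needed to score generated text
        **TRAINING_KWARGS,
    )


def linear_lr(step: int, total_steps: int = 3930, base_lr: float = 5e-5,
              warmup_steps: int = 0) -> float:
    """Learning rate under linear decay to zero (warmup assumed to be 0)."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)
```

Under these assumptions the learning rate falls from 5e-05 at step 0 to half that value at step 1965 and reaches zero at step 3930.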

Training results

Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len
1.2634        | 1.0   | 786  | 1.1040          | 42.2483 | 13.4135 | 40.3622 | 40.5436   | 19.0
1.1931        | 2.0   | 1572 | 1.0637          | 41.6604 | 14.0866 | 39.7604 | 39.9971   | 19.0
1.1347        | 3.0   | 2358 | 1.0421          | 45.0247 | 14.5438 | 42.8164 | 43.1956   | 19.0
1.1155        | 4.0   | 3144 | 1.0312          | 45.6977 | 16.2231 | 43.5457 | 43.8965   | 19.0
1.0827        | 5.0   | 3930 | 1.0245          | 45.6581 | 15.9871 | 43.5188 | 43.913    | 19.0
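
The final checkpoint is not the best on every metric in these results: epoch 4 scores slightly higher on Rouge1, Rouge2, and Rougel, while epoch 5 has the lowest validation loss and the highest Rougelsum. The short sketch below makes that comparison explicit. The numbers are copied from the results above; the names `ROWS`, `COLUMNS`, and `best_epoch` are illustrative only.

```python
# (epoch, train_loss, val_loss, rouge1, rouge2, rougel, rougelsum)
ROWS = [
    (1, 1.2634, 1.1040, 42.2483, 13.4135, 40.3622, 40.5436),
    (2, 1.1931, 1.0637, 41.6604, 14.0866, 39.7604, 39.9971),
    (3, 1.1347, 1.0421, 45.0247, 14.5438, 42.8164, 43.1956),
    (4, 1.1155, 1.0312, 45.6977, 16.2231, 43.5457, 43.8965),
    (5, 1.0827, 1.0245, 45.6581, 15.9871, 43.5188, 43.9130),
]

COLUMNS = {
    "val_loss": 2,
    "rouge1": 3,
    "rouge2": 4,
    "rougel": 5,
    "rougelsum": 6,
}


def best_epoch(metric: str) -> int:
    """Return the epoch with the best value: lowest loss, highest ROUGE."""
    index = COLUMNS[metric]
    pick = min if metric == "val_loss" else max
    return pick(ROWS, key=lambda row: row[index])[0]


if __name__ == "__main__":
    for name in COLUMNS:
        print(f"{name}: best at epoch {best_epoch(name)}")
```

The gaps between epochs 4 and 5 are small (under 0.25 points on every ROUGE variant), so either checkpoint is a reasonable choice depending on which metric matters most.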

Framework versions

  • Transformers 4.35.2
  • Pytorch 2.0.1+cu117
  • Datasets 2.15.0
  • Tokenizers 0.15.0
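
To check whether a local environment matches these versions, something like the following sketch could be used. It relies only on the standard library; the pip package name `torch` for Pytorch and the choice to ignore local build tags such as `+cu117` when comparing are assumptions, and `find_mismatches` is a hypothetical helper name.

```python
from importlib import metadata

# Versions listed above, keyed by assumed pip package name.
PINNED = {
    "transformers": "4.35.2",
    "torch": "2.0.1",
    "datasets": "2.15.0",
    "tokenizers": "0.15.0",
}


def _installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pinned=PINNED, lookup=_installed_version):
    """Map each mismatching package to a (wanted, found) pair."""
    mismatches = {}
    for package, wanted in pinned.items():
        found = lookup(package)
        # Ignore local build tags such as "+cu117" when comparing.
        if found is None or found.split("+")[0] != wanted:
            mismatches[package] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for package, (wanted, found) in find_mismatches().items():
        print(f"{package}: wanted {wanted}, found {found}")
```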

Model size: 248M params (Safetensors, tensor type F32)