t5-flan-semantic

This model is a fine-tuned version of google/flan-t5-base; the training dataset is not specified. On the evaluation set it achieves the following results, which match the final training epoch (epoch 30, step 60):

  • Loss: 0.0553
  • Rouge1: 0.8952
  • Rouge2: 0.8673
  • Rougel: 0.8952
  • Rougelsum: 0.8952

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 30
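
As a rough illustration of how these settings fit together, the sketch below collects them in a dictionary and computes the learning rate that a linear schedule would produce at a given step. The key names follow the argument names of `transformers.Seq2SeqTrainingArguments`, which is an assumption about how the run was configured (including the single-device reading of the batch sizes). The warmup length is not listed above, so zero warmup is assumed. The step count per epoch is taken from the training results table.

```python
# Hyperparameters copied from the list above. The keys mirror argument names
# of transformers.Seq2SeqTrainingArguments; that mapping is an assumption,
# as is treating the batch sizes as per-device values on a single device.
TRAINING_KWARGS = {
    "learning_rate": 3e-4,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 30,
}

STEPS_PER_EPOCH = 2  # from the results table: step 2 is reached at epoch 1.0
TOTAL_STEPS = TRAINING_KWARGS["num_train_epochs"] * STEPS_PER_EPOCH  # 60


def linear_lr(step: int, warmup_steps: int = 0) -> float:
    """Learning rate at an optimizer step with linear warmup, then linear decay.

    warmup_steps=0 is an assumption; the card does not list a warmup setting.
    """
    base_lr = TRAINING_KWARGS["learning_rate"]
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, TOTAL_STEPS - step)
    return base_lr * remaining / max(1, TOTAL_STEPS - warmup_steps)


if __name__ == "__main__":
    # Print the rate at the start, midpoint, and end of training.
    for step in (0, 30, 60):
        print(step, linear_lr(step))
```

If the mapping holds, the dictionary could be passed as `Seq2SeqTrainingArguments(output_dir=..., **TRAINING_KWARGS)`, where the output directory is left to the reader.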

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
|---------------|-------|------|-----------------|--------|--------|--------|-----------|
| 0.0194 | 1.0 | 2 | 0.0082 | 0.9333 | 0.9158 | 0.9333 | 0.9333 |
| 0.0042 | 2.0 | 4 | 0.0408 | 0.9238 | 0.9031 | 0.9238 | 0.9238 |
| 0.0002 | 3.0 | 6 | 0.0647 | 0.9238 | 0.9031 | 0.9238 | 0.9238 |
| 0.0002 | 4.0 | 8 | 0.1117 | 0.9238 | 0.9031 | 0.9238 | 0.9238 |
| 0.0007 | 5.0 | 10 | 0.1404 | 0.9238 | 0.9031 | 0.9238 | 0.9238 |
| 0.0006 | 6.0 | 12 | 0.0987 | 0.9238 | 0.9031 | 0.9238 | 0.9238 |
| 0.0005 | 7.0 | 14 | 0.0587 | 0.9238 | 0.9031 | 0.9238 | 0.9238 |
| 0.0005 | 8.0 | 16 | 0.0251 | 0.9238 | 0.9031 | 0.9238 | 0.9238 |
| 0.0022 | 9.0 | 18 | 0.0128 | 0.9095 | 0.8852 | 0.9095 | 0.9095 |
| 0.0002 | 10.0 | 20 | 0.0228 | 0.8952 | 0.8622 | 0.8952 | 0.8952 |
| 0.0003 | 11.0 | 22 | 0.0351 | 0.8952 | 0.8622 | 0.8952 | 0.8952 |
| 0.0008 | 12.0 | 24 | 0.0374 | 0.9190 | 0.8980 | 0.9190 | 0.9190 |
| 0.0004 | 13.0 | 26 | 0.0462 | 0.8952 | 0.8673 | 0.8952 | 0.8952 |
| 0.0 | 14.0 | 28 | 0.0610 | 0.8952 | 0.8673 | 0.8952 | 0.8952 |
| 0.0001 | 15.0 | 30 | 0.0737 | 0.8952 | 0.8673 | 0.8952 | 0.8952 |
| 0.0005 | 16.0 | 32 | 0.0839 | 0.8952 | 0.8673 | 0.8952 | 0.8952 |
| 0.0002 | 17.0 | 34 | 0.0917 | 0.8952 | 0.8673 | 0.8952 | 0.8952 |
| 0.0009 | 18.0 | 36 | 0.1001 | 0.8952 | 0.8673 | 0.8952 | 0.8952 |
| 0.0005 | 19.0 | 38 | 0.1054 | 0.8952 | 0.8673 | 0.8952 | 0.8952 |
| 0.0012 | 20.0 | 40 | 0.1079 | 0.8952 | 0.8673 | 0.8952 | 0.8952 |
| 0.0 | 21.0 | 42 | 0.1085 | 0.8952 | 0.8673 | 0.8952 | 0.8952 |
| 0.0 | 22.0 | 44 | 0.1015 | 0.8952 | 0.8673 | 0.8952 | 0.8952 |
| 0.0018 | 23.0 | 46 | 0.0862 | 0.8952 | 0.8673 | 0.8952 | 0.8952 |
| 0.0001 | 24.0 | 48 | 0.0752 | 0.8952 | 0.8673 | 0.8952 | 0.8952 |
| 0.0004 | 25.0 | 50 | 0.0675 | 0.8952 | 0.8673 | 0.8952 | 0.8952 |
| 0.0001 | 26.0 | 52 | 0.0623 | 0.8952 | 0.8673 | 0.8952 | 0.8952 |
| 0.0 | 27.0 | 54 | 0.0589 | 0.8952 | 0.8673 | 0.8952 | 0.8952 |
| 0.0005 | 28.0 | 56 | 0.0568 | 0.8952 | 0.8673 | 0.8952 | 0.8952 |
| 0.0002 | 29.0 | 58 | 0.0557 | 0.8952 | 0.8673 | 0.8952 | 0.8952 |
| 0.0002 | 30.0 | 60 | 0.0553 | 0.8952 | 0.8673 | 0.8952 | 0.8952 |
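
The per-epoch results can be scanned programmatically to see which epoch scored best on each metric. The following is a minimal sketch with the validation loss and Rouge1 columns copied from the results above. The helper name `best_epoch` is illustrative only, and nothing here indicates which checkpoint was actually saved or published.

```python
# Validation loss and Rouge1 copied from the training results, one entry per
# epoch (epochs 1 through 30, in order).
VAL_LOSS = [
    0.0082, 0.0408, 0.0647, 0.1117, 0.1404, 0.0987, 0.0587, 0.0251, 0.0128, 0.0228,
    0.0351, 0.0374, 0.0462, 0.0610, 0.0737, 0.0839, 0.0917, 0.1001, 0.1054, 0.1079,
    0.1085, 0.1015, 0.0862, 0.0752, 0.0675, 0.0623, 0.0589, 0.0568, 0.0557, 0.0553,
]
ROUGE1 = [0.9333] + [0.9238] * 7 + [0.9095, 0.8952, 0.8952, 0.9190] + [0.8952] * 18


def best_epoch(values, higher_is_better):
    """Return the 1-based epoch of the best value; ties go to the earliest epoch."""
    pick = max if higher_is_better else min
    return values.index(pick(values)) + 1


if __name__ == "__main__":
    print("lowest validation loss at epoch", best_epoch(VAL_LOSS, higher_is_better=False))
    print("highest Rouge1 at epoch", best_epoch(ROUGE1, higher_is_better=True))
    print("final epoch:", VAL_LOSS[-1], ROUGE1[-1])
```

Comparing the best epoch with the final one is a quick way to check whether the reported evaluation numbers reflect the strongest checkpoint in the run.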

Framework versions

  • Transformers 4.38.2
  • Pytorch 2.2.1+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2

Model size

  • Parameters: 248M
  • Tensor type: F32
  • Weights format: Safetensors