metadata
license: apache-2.0
base_model: google/flan-t5-base
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: t5-flan-semantic
    results: []

t5-flan-semantic

This model is a fine-tuned version of google/flan-t5-base. The training dataset is not documented. After the final epoch (epoch 30, step 60) it achieves the following results on the evaluation set:

  • Loss: 0.0553
  • Rouge1: 0.8952
  • Rouge2: 0.8673
  • Rougel: 0.8952
  • Rougelsum: 0.8952
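For readers unfamiliar with the metrics listed above, here is a simplified sketch of a ROUGE-N F-measure computed from clipped n-gram overlap. It assumes lowercasing and whitespace tokenization, and it is not the code that produced the scores in this card (the evaluation script is not documented). The function name `rouge_n_f1` and the example strings are hypothetical.

```python
from collections import Counter


def rouge_n_f1(prediction: str, reference: str, n: int = 1) -> float:
    """Simplified ROUGE-N F-measure (assumption: whitespace tokens, lowercased)."""

    def ngrams(text: str) -> Counter:
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    pred, ref = ngrams(prediction), ngrams(reference)
    if not pred or not ref:
        return 0.0

    # Clipped overlap: each n-gram counts at most as often as it appears in both texts.
    overlap = sum((pred & ref).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(pred.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical strings, not taken from the model's evaluation set.
    print(rouge_n_f1("the cat sat", "the cat slept", n=1))
    print(rouge_n_f1("the cat sat", "the cat slept", n=2))
```

Rougel and Rougelsum are based on the longest common subsequence rather than fixed-size n-grams, so this sketch does not cover them.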

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 30
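To illustrate what the `linear` scheduler setting implies for the learning rate, the sketch below computes a linearly decaying rate from the listed `learning_rate` of 0.0003. It assumes zero warmup steps (the card lists none) and 60 total optimizer steps (the final step count in the training results). The function name `linear_lr` is hypothetical, and the formula is a plain-Python approximation of a linear decay schedule, not the training code itself.

```python
def linear_lr(
    step: int,
    base_lr: float = 3e-4,    # learning_rate from the list above
    total_steps: int = 60,    # assumption: final step count in the results table
    warmup_steps: int = 0,    # assumption: no warmup is listed in this card
) -> float:
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 15, 30, 45, 60):
        print(step, linear_lr(step))
```

Under these assumptions the rate is halved by step 30 (the end of epoch 15) and reaches zero at step 60.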

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
|---------------|-------|------|-----------------|--------|--------|--------|-----------|
| 0.0194 | 1.0 | 2 | 0.0082 | 0.9333 | 0.9158 | 0.9333 | 0.9333 |
| 0.0042 | 2.0 | 4 | 0.0408 | 0.9238 | 0.9031 | 0.9238 | 0.9238 |
| 0.0002 | 3.0 | 6 | 0.0647 | 0.9238 | 0.9031 | 0.9238 | 0.9238 |
| 0.0002 | 4.0 | 8 | 0.1117 | 0.9238 | 0.9031 | 0.9238 | 0.9238 |
| 0.0007 | 5.0 | 10 | 0.1404 | 0.9238 | 0.9031 | 0.9238 | 0.9238 |
| 0.0006 | 6.0 | 12 | 0.0987 | 0.9238 | 0.9031 | 0.9238 | 0.9238 |
| 0.0005 | 7.0 | 14 | 0.0587 | 0.9238 | 0.9031 | 0.9238 | 0.9238 |
| 0.0005 | 8.0 | 16 | 0.0251 | 0.9238 | 0.9031 | 0.9238 | 0.9238 |
| 0.0022 | 9.0 | 18 | 0.0128 | 0.9095 | 0.8852 | 0.9095 | 0.9095 |
| 0.0002 | 10.0 | 20 | 0.0228 | 0.8952 | 0.8622 | 0.8952 | 0.8952 |
| 0.0003 | 11.0 | 22 | 0.0351 | 0.8952 | 0.8622 | 0.8952 | 0.8952 |
| 0.0008 | 12.0 | 24 | 0.0374 | 0.9190 | 0.8980 | 0.9190 | 0.9190 |
| 0.0004 | 13.0 | 26 | 0.0462 | 0.8952 | 0.8673 | 0.8952 | 0.8952 |
| 0.0 | 14.0 | 28 | 0.0610 | 0.8952 | 0.8673 | 0.8952 | 0.8952 |
| 0.0001 | 15.0 | 30 | 0.0737 | 0.8952 | 0.8673 | 0.8952 | 0.8952 |
| 0.0005 | 16.0 | 32 | 0.0839 | 0.8952 | 0.8673 | 0.8952 | 0.8952 |
| 0.0002 | 17.0 | 34 | 0.0917 | 0.8952 | 0.8673 | 0.8952 | 0.8952 |
| 0.0009 | 18.0 | 36 | 0.1001 | 0.8952 | 0.8673 | 0.8952 | 0.8952 |
| 0.0005 | 19.0 | 38 | 0.1054 | 0.8952 | 0.8673 | 0.8952 | 0.8952 |
| 0.0012 | 20.0 | 40 | 0.1079 | 0.8952 | 0.8673 | 0.8952 | 0.8952 |
| 0.0 | 21.0 | 42 | 0.1085 | 0.8952 | 0.8673 | 0.8952 | 0.8952 |
| 0.0 | 22.0 | 44 | 0.1015 | 0.8952 | 0.8673 | 0.8952 | 0.8952 |
| 0.0018 | 23.0 | 46 | 0.0862 | 0.8952 | 0.8673 | 0.8952 | 0.8952 |
| 0.0001 | 24.0 | 48 | 0.0752 | 0.8952 | 0.8673 | 0.8952 | 0.8952 |
| 0.0004 | 25.0 | 50 | 0.0675 | 0.8952 | 0.8673 | 0.8952 | 0.8952 |
| 0.0001 | 26.0 | 52 | 0.0623 | 0.8952 | 0.8673 | 0.8952 | 0.8952 |
| 0.0 | 27.0 | 54 | 0.0589 | 0.8952 | 0.8673 | 0.8952 | 0.8952 |
| 0.0005 | 28.0 | 56 | 0.0568 | 0.8952 | 0.8673 | 0.8952 | 0.8952 |
| 0.0002 | 29.0 | 58 | 0.0557 | 0.8952 | 0.8673 | 0.8952 | 0.8952 |
| 0.0002 | 30.0 | 60 | 0.0553 | 0.8952 | 0.8673 | 0.8952 | 0.8952 |
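As a reading aid for the training results above, the following sketch finds the epoch with the lowest validation loss and the epoch with the highest Rouge1. The two lists are copied from the Validation Loss and Rouge1 columns; the helper name `best_epoch` is hypothetical, and ties resolve to the earliest epoch.

```python
# Validation Loss per epoch (epochs 1..30), copied from the training results.
VAL_LOSS = [
    0.0082, 0.0408, 0.0647, 0.1117, 0.1404, 0.0987, 0.0587, 0.0251, 0.0128, 0.0228,
    0.0351, 0.0374, 0.0462, 0.0610, 0.0737, 0.0839, 0.0917, 0.1001, 0.1054, 0.1079,
    0.1085, 0.1015, 0.0862, 0.0752, 0.0675, 0.0623, 0.0589, 0.0568, 0.0557, 0.0553,
]

# Rouge1 per epoch (epochs 1..30), copied from the training results.
ROUGE1 = [0.9333] + [0.9238] * 7 + [0.9095, 0.8952, 0.8952, 0.9190] + [0.8952] * 18


def best_epoch(values: list, higher_is_better: bool) -> int:
    """Return the 1-based epoch with the best value (earliest epoch wins ties)."""
    pick = max if higher_is_better else min
    index = pick(range(len(values)), key=values.__getitem__)
    return index + 1


if __name__ == "__main__":
    print("lowest validation loss at epoch", best_epoch(VAL_LOSS, higher_is_better=False))
    print("highest Rouge1 at epoch", best_epoch(ROUGE1, higher_is_better=True))
```

Both criteria select epoch 1 (validation loss 0.0082, Rouge1 0.9333), whereas the headline metrics in this card are those of epoch 30. The card does not state whether the published weights correspond to the last or the best checkpoint.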

Framework versions

  • Transformers 4.38.2
  • PyTorch 2.2.1+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2
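When trying to reproduce the setup, it can help to compare the local environment against the versions listed above. The sketch below does so using only the standard library. The distribution names (`transformers`, `torch`, `datasets`, `tokenizers`) are assumed to be the usual PyPI names, and the helper `version_mismatches` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this card; keys are assumed PyPI distribution names.
EXPECTED = {
    "transformers": "4.38.2",
    "torch": "2.2.1+cu121",
    "datasets": "2.18.0",
    "tokenizers": "0.15.2",
}


def version_mismatches(expected=EXPECTED, lookup=version):
    """Return {name: (expected, installed)} for packages that differ or are missing."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = lookup(name)
        except PackageNotFoundError:
            installed = None  # package is not installed
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, installed) in version_mismatches().items():
        print(f"{name}: card lists {wanted}, found {installed}")
```

A mismatch does not necessarily prevent the model from loading; the comparison is an exact string match and is meant only as a quick diagnostic.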