flan-t5-base-AR-LORA-V1

This model is a LoRA adapter (trained with PEFT) for google/flan-t5-base, fine-tuned on an unspecified dataset. It achieves the following results on the evaluation set (a usage sketch follows the list):

  • Loss: 0.7887
  • Exact Match: 28.3
  • Gen Len: 3.592
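
Because this repository contains only the LoRA adapter weights, it has to be loaded on top of the google/flan-t5-base base model. A minimal sketch, assuming a hypothetical Hub id of `your-username/flan-t5-base-AR-LORA-V1`:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
from peft import PeftModel

base_id = "google/flan-t5-base"
adapter_id = "your-username/flan-t5-base-AR-LORA-V1"  # hypothetical repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForSeq2SeqLM.from_pretrained(base_id)

# Attach the LoRA adapter weights on top of the frozen base model.
model = PeftModel.from_pretrained(base_model, adapter_id)

inputs = tokenizer("Your input text here", return_tensors="pt")
# The evaluation Gen Len averages ~3.6 tokens, so short outputs are expected.
outputs = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```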

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch of the equivalent `Seq2SeqTrainingArguments` follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 30
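
As a rough guide, these settings map onto the Transformers `Seq2SeqTrainingArguments` as sketched below. The original training script is not published, so anything beyond the listed values (the output directory, `predict_with_generate`) is an assumption:

```python
from transformers import Seq2SeqTrainingArguments

# Hedged reconstruction of the training configuration; only the values
# listed above are known, the rest are Transformers defaults or assumptions.
training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-base-AR-LORA-V1",  # assumed output path
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    num_train_epochs=30,
    lr_scheduler_type="linear",
    predict_with_generate=True,  # assumption: needed for Exact Match / Gen Len
    # Adam with betas=(0.9, 0.999) and epsilon=1e-08 matches the default
    # AdamW optimizer, so no explicit optimizer arguments are required.
)
```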

Training results

| Training Loss | Epoch | Step  | Validation Loss | Exact Match | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-----------:|:-------:|
| 1.1717        | 1.0   | 625   | 0.9465          | 18.9        | 3.82    |
| 0.8167        | 2.0   | 1250  | 0.8975          | 17.9        | 3.923   |
| 0.9046        | 3.0   | 1875  | 0.8691          | 25.4        | 3.338   |
| 0.9501        | 4.0   | 2500  | 0.8624          | 17.8        | 3.978   |
| 0.884         | 5.0   | 3125  | 0.8469          | 19.9        | 3.917   |
| 0.8418        | 6.0   | 3750  | 0.8356          | 24.8        | 3.596   |
| 0.877         | 7.0   | 4375  | 0.8261          | 19.0        | 3.926   |
| 0.804         | 8.0   | 5000  | 0.8147          | 23.0        | 3.732   |
| 0.8267        | 9.0   | 5625  | 0.8123          | 26.0        | 3.629   |
| 0.8979        | 10.0  | 6250  | 0.8132          | 24.5        | 3.685   |
| 0.8165        | 11.0  | 6875  | 0.8084          | 28.4        | 3.517   |
| 0.891         | 12.0  | 7500  | 0.8034          | 28.1        | 3.548   |
| 0.768         | 13.0  | 8125  | 0.8095          | 29.1        | 3.45    |
| 0.6895        | 14.0  | 8750  | 0.8018          | 27.7        | 3.553   |
| 0.7796        | 15.0  | 9375  | 0.7996          | 30.1        | 3.49    |
| 0.787         | 16.0  | 10000 | 0.8013          | 26.0        | 3.665   |
| 0.811         | 17.0  | 10625 | 0.7979          | 28.5        | 3.563   |
| 0.7858        | 18.0  | 11250 | 0.7991          | 26.4        | 3.64    |
| 0.8608        | 19.0  | 11875 | 0.7955          | 24.8        | 3.733   |
| 0.9044        | 20.0  | 12500 | 0.7913          | 25.9        | 3.662   |
| 0.9171        | 21.0  | 13125 | 0.7905          | 25.9        | 3.708   |
| 0.8093        | 22.0  | 13750 | 0.7918          | 28.1        | 3.596   |
| 0.7653        | 23.0  | 14375 | 0.7940          | 28.3        | 3.586   |
| 0.9361        | 24.0  | 15000 | 0.7887          | 28.3        | 3.592   |
| 0.6999        | 25.0  | 15625 | 0.7921          | 29.6        | 3.552   |
| 0.728         | 26.0  | 16250 | 0.7918          | 27.8        | 3.621   |
| 0.7169        | 27.0  | 16875 | 0.7908          | 27.2        | 3.628   |
| 0.6388        | 28.0  | 17500 | 0.7920          | 28.9        | 3.572   |
| 0.7302        | 29.0  | 18125 | 0.7920          | 28.8        | 3.573   |
| 0.7651        | 30.0  | 18750 | 0.7917          | 28.0        | 3.599   |

Framework versions

  • PEFT 0.11.1
  • Transformers 4.41.2
  • PyTorch 2.2.1
  • Datasets 2.19.1
  • Tokenizers 0.19.1
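
To approximate this environment, the versions above can be pinned in a `requirements.txt` such as the following; the exact PyTorch build (CUDA vs. CPU) depends on your platform:

```
peft==0.11.1
transformers==4.41.2
torch==2.2.1
datasets==2.19.1
tokenizers==0.19.1
```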