
oop-de-qg-flan-t5-base-v4

This model is a fine-tuned version of google/flan-t5-base on an unknown dataset. It achieves the following results on the evaluation set (a sketch of how these metrics can be reproduced follows the list):

  • Loss: 0.8388
  • ROUGE-1: 59.6054
  • ROUGE-2: 46.5045
  • ROUGE-L: 58.1566
  • ROUGE-Lsum: 58.1981
  • Gen Len (mean generated tokens): 14.5287
  • BLEU: 0.3568
  • BLEU n-gram precisions (1- to 4-gram): [0.6571719226856562, 0.4774637127578304, 0.3935286401399213, 0.32975460122699385]
  • Brevity penalty: 0.7943
  • Length ratio: 0.8128
  • Translation length: 2949
  • Reference length: 3628
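
These metric names mirror the output of the Hugging Face `evaluate` library. Below is a minimal sketch of how such numbers can be computed; the example strings are placeholders (German is only an assumption based on the `de` in the model name), and note that `evaluate` returns ROUGE values in [0, 1] while the card reports them scaled by 100.

```python
import evaluate

# Load the same metric implementations the card's numbers correspond to.
rouge = evaluate.load("rouge")
bleu = evaluate.load("bleu")

# Placeholder predictions/references, NOT from this model's actual eval set.
predictions = ["Was ist Vererbung in der objektorientierten Programmierung?"]
references = [["Was versteht man unter Vererbung in der OOP?"]]

# ROUGE returns rouge1, rouge2, rougeL, rougeLsum (fractions in [0, 1]).
rouge_scores = rouge.compute(
    predictions=predictions,
    references=[r[0] for r in references],
)

# BLEU returns bleu, precisions, brevity_penalty, length_ratio,
# translation_length, and reference_length -- the fields listed above.
bleu_scores = bleu.compute(predictions=predictions, references=references)

print(rouge_scores)
print(bleu_scores)
```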

Model description

More information needed

Intended uses & limitations

More information needed
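
Pending documentation, here is a hypothetical inference sketch. The hub ID `username/oop-de-qg-flan-t5-base-v4`, the input format (a context passage from which a question is generated), and the German example are assumptions, not documented behavior.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "username/oop-de-qg-flan-t5-base-v4"  # placeholder hub ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Assumed usage: generate a question from an OOP context passage.
context = (
    "Vererbung erlaubt es einer Klasse, Attribute und Methoden "
    "einer anderen Klasse zu übernehmen."
)
inputs = tokenizer(context, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```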

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch of the equivalent `Seq2SeqTrainingArguments` follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 10
  • eval_batch_size: 10
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 40
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 8
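
A minimal sketch of the equivalent Transformers training configuration; the output directory and `predict_with_generate` are assumptions not listed in the card. Note that the effective batch size of 40 is simply `train_batch_size` (10) times `gradient_accumulation_steps` (4).

```python
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="oop-de-qg-flan-t5-base-v4",  # placeholder
    learning_rate=5e-5,
    per_device_train_batch_size=10,
    per_device_eval_batch_size=10,
    gradient_accumulation_steps=4,  # effective train batch size: 10 * 4 = 40
    num_train_epochs=8,
    lr_scheduler_type="linear",     # linear decay, the Trainer default
    seed=42,
    predict_with_generate=True,     # assumption: needed for ROUGE/BLEU eval
)
```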

Training results

| Training Loss | Epoch | Step | Validation Loss | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum | Gen Len | BLEU | Precisions (1- to 4-gram) | Brevity Penalty | Length Ratio | Translation Length | Reference Length |
|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|
| No log | 1.0 | 58 | 0.9821 | 56.9689 | 42.8079 | 55.2974 | 55.3949 | 14.6133 | 0.3163 | [0.6255566974991436, 0.43508500772797526, 0.34603455914931325, 0.2808930425752856] | 0.7844 | 0.8046 | 2919 | 3628 |
| No log | 1.99 | 116 | 0.9155 | 57.2587 | 43.5697 | 55.6919 | 55.8245 | 14.4018 | 0.3208 | [0.629286694101509, 0.43945841392649904, 0.35226264418811004, 0.28861154446177845] | 0.7834 | 0.8037 | 2916 | 3628 |
| No log | 2.99 | 174 | 0.8738 | 58.2245 | 44.9681 | 56.8015 | 56.9123 | 14.2719 | 0.3356 | [0.6483021483021483, 0.461839530332681, 0.37634892086330934, 0.31484416270470156] | 0.7733 | 0.7955 | 2886 | 3628 |
| No log | 4.0 | 233 | 0.8636 | 59.608 | 46.4603 | 58.1054 | 58.1991 | 14.4381 | 0.3494 | [0.6566037735849056, 0.47794117647058826, 0.38925876608965826, 0.32466181061394384] | 0.7830 | 0.8035 | 2915 | 3628 |
| No log | 5.0 | 291 | 0.8460 | 59.0765 | 45.7893 | 57.418 | 57.573 | 14.5196 | 0.3450 | [0.6488601565158217, 0.46702453987730064, 0.3790074659639877, 0.31500513874614594] | 0.7910 | 0.8101 | 2939 | 3628 |
| No log | 5.99 | 349 | 0.8427 | 58.3394 | 44.9653 | 56.6741 | 56.7472 | 14.6254 | 0.3439 | [0.6412933647692826, 0.4586808188021228, 0.3740788903337668, 0.30870445344129555] | 0.8009 | 0.8184 | 2969 | 3628 |
| No log | 6.99 | 407 | 0.8398 | 59.5629 | 46.3188 | 58.0534 | 58.1472 | 14.5257 | 0.3549 | [0.65625, 0.47722923842326825, 0.39176161262050835, 0.32752434648898] | 0.7927 | 0.8115 | 2944 | 3628 |
| No log | 7.97 | 464 | 0.8388 | 59.6054 | 46.5045 | 58.1566 | 58.1981 | 14.5287 | 0.3568 | [0.6571719226856562, 0.4774637127578304, 0.3935286401399213, 0.32975460122699385] | 0.7943 | 0.8128 | 2949 | 3628 |

Framework versions

  • Transformers 4.35.2
  • Pytorch 2.1.0+cu121
  • Datasets 2.16.1
  • Tokenizers 0.15.1