
# oop-de-qg-flan-t5-base-v5

This model is a fine-tuned version of [google/flan-t5-base](https://huggingface.co/google/flan-t5-base) on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 0.8305
- Rouge1: 60.2858
- Rouge2: 47.0551
- Rougel: 58.5541
- Rougelsum: 58.5986
- Gen Len: 14.6254
- Bleu: 0.3585
- Precisions: [0.6612685560053981, 0.4800607671857197, 0.39139878366637704, 0.3257229832572298]
- Brevity Penalty: 0.7993
- Length Ratio: 0.8170
- Translation Length: 2964
- Reference Length: 3628
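
The brevity penalty and length ratio above follow directly from the translation and reference lengths, using the standard BLEU definition; a minimal check in Python:

```python
import math

# Reported lengths from the evaluation set above.
translation_length = 2964
reference_length = 3628

# Length ratio is simply candidate length over reference length.
length_ratio = translation_length / reference_length  # ~0.8170

# BLEU's brevity penalty: 1.0 if the candidate is at least as long as the
# reference, otherwise exp(1 - reference_length / translation_length).
brevity_penalty = (
    1.0
    if translation_length > reference_length
    else math.exp(1 - reference_length / translation_length)  # ~0.7993
)

print(f"Length ratio:    {length_ratio:.4f}")
print(f"Brevity penalty: {brevity_penalty:.4f}")
```

Both values match the reported 0.8170 and 0.7993.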

## Model description

More information needed

## Intended uses & limitations

More information needed
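
In the absence of documented usage, the checkpoint can at least be exercised as a standard seq2seq model. A minimal inference sketch, assuming the checkpoint is published under a Hugging Face repo id (the `<user>/...` namespace below is a placeholder) and guessing from the `de-qg` in the name that inputs are German passages for question generation:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Placeholder repo id -- substitute the actual namespace of this checkpoint.
model_id = "<user>/oop-de-qg-flan-t5-base-v5"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Hypothetical input: a German OOP passage (the training data is undocumented).
text = (
    "Vererbung erlaubt es einer Klasse, Attribute und Methoden "
    "einer anderen Klasse zu übernehmen."
)

inputs = tokenizer(text, return_tensors="pt", truncation=True)
# Average generation length on the eval set was ~15 tokens.
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```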

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training; a sketch mapping them onto the `Trainer` API follows the list:

- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 8
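
A sketch of how these values map onto `Seq2SeqTrainingArguments` in the Transformers `Trainer` API (the output directory is illustrative; dataset and preprocessing are not documented here):

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="oop-de-qg-flan-t5-base-v5",  # illustrative
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=4,  # effective train batch size: 8 * 4 = 32
    lr_scheduler_type="linear",
    num_train_epochs=8,
    predict_with_generate=True,  # assumption: required for ROUGE/BLEU eval
    # Adam with betas=(0.9, 0.999) and epsilon=1e-08 is the Trainer default.
)
```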

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len | Bleu   | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|:------:|:----------:|:---------------:|:------------:|:------------------:|:----------------:|
| No log        | 0.99  | 72   | 0.9838          | 58.281  | 44.4811 | 56.6252 | 56.6047   | 14.6042 | 0.3304 | [0.6428324697754749, 0.4543681747269891, 0.367666815942678, 0.30546792849631965] | 0.7763 | 0.7980 | 2895 | 3628 |
| No log        | 1.99  | 145  | 0.9010          | 55.8534 | 42.0605 | 54.3596 | 54.3148   | 14.6586 | 0.3076 | [0.6021433355659745, 0.41167608286252355, 0.3253012048192771, 0.26241846462619167] | 0.8065 | 0.8230 | 2986 | 3628 |
| No log        | 3.0   | 218  | 0.8767          | 57.7174 | 44.1283 | 56.4402 | 56.3292   | 14.5136 | 0.3323 | [0.6361781706902414, 0.4509578544061303, 0.36287845546292236, 0.2982546201232033] | 0.7917 | 0.8106 | 2941 | 3628 |
| No log        | 4.0   | 291  | 0.8583          | 60.2113 | 47.3135 | 58.8257 | 58.7408   | 14.3233 | 0.3580 | [0.6711758584807492, 0.49490595611285265, 0.4074741107609185, 0.3412698412698413] | 0.7723 | 0.7947 | 2883 | 3628 |
| No log        | 4.99  | 363  | 0.8396          | 59.8588 | 46.8718 | 58.3234 | 58.2478   | 14.4894 | 0.3539 | [0.6580469547465124, 0.47929447852760737, 0.39042599912165127, 0.32528263103802674] | 0.7910 | 0.8101 | 2939 | 3628 |
| No log        | 5.99  | 436  | 0.8316          | 59.7653 | 46.5459 | 58.066  | 58.1354   | 14.4804 | 0.3548 | [0.6613342409802587, 0.4798619102416571, 0.3914762741652021, 0.3264781491002571] | 0.7907 | 0.8098 | 2938 | 3628 |
| 0.9411        | 7.0   | 509  | 0.8305          | 60.2858 | 47.0551 | 58.5541 | 58.5986   | 14.6254 | 0.3585 | [0.6612685560053981, 0.4800607671857197, 0.39139878366637704, 0.3257229832572298] | 0.7993 | 0.8170 | 2964 | 3628 |
| 0.9411        | 7.92  | 576  | 0.8309          | 60.2226 | 47.1068 | 58.611  | 58.5902   | 14.6526 | 0.3605 | [0.6590450571620713, 0.4801362088535755, 0.39273356401384085, 0.3276123170116103] | 0.8026 | 0.8197 | 2974 | 3628 |

### Framework versions

- Transformers 4.35.2
- Pytorch 2.1.0+cu121
- Datasets 2.16.1
- Tokenizers 0.15.1