
oop-de-qg-flan-t5-base-v3

This model is a fine-tuned version of google/flan-t5-base on an unknown dataset. It achieves the following results on the evaluation set (a sketch of how such metrics can be computed follows the list):

  • Loss: nan
  • Rouge1: 8.0858
  • Rouge2: 3.0935
  • RougeL: 7.2494
  • RougeLsum: 7.3009
  • Gen Len: 58.0151
  • Bleu: 0.0107
  • Precisions: [0.04098414148665405, 0.014941302027748132, 0.007025441647909419, 0.003000697836706211]
  • Brevity Penalty: 1.0
  • Length Ratio: 4.2235
  • Translation Length: 15323
  • Reference Length: 3628
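
The metric names and fields above match the output format of the Hugging Face `evaluate` library (ROUGE scores, plus BLEU with per-n-gram precisions, brevity penalty, and length statistics). A minimal sketch of reproducing such numbers; the `preds` and `refs` lists are placeholders, not the actual evaluation data, which is not published:

```python
import evaluate

# Placeholder German QG examples; the real evaluation set is unknown.
preds = ["Was ist Vererbung in der objektorientierten Programmierung?"]
refs = ["Was versteht man unter Vererbung?"]

rouge = evaluate.load("rouge")
bleu = evaluate.load("bleu")

# Returns rouge1, rouge2, rougeL, rougeLsum -- the ROUGE fields above.
print(rouge.compute(predictions=preds, references=refs))

# Returns bleu, precisions (1- to 4-gram), brevity_penalty, length_ratio,
# translation_length, reference_length -- the remaining fields above.
print(bleu.compute(predictions=preds, references=[[r] for r in refs]))
```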

Model description

More information needed

Intended uses & limitations

More information needed
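
In the absence of author-provided documentation, a minimal inference sketch, assuming the model is a FLAN-T5-style seq2seq checkpoint for German question generation (as the name `oop-de-qg` suggests). The model id must be replaced with the full Hub path including the owner namespace, and the input text is a hypothetical example, since the prompt format used in training is unknown:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Replace with the full Hub id, e.g. "<owner>/oop-de-qg-flan-t5-base-v3".
model_id = "oop-de-qg-flan-t5-base-v3"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Hypothetical OOP passage to generate a question from.
text = ("Vererbung erlaubt es einer Klasse, Attribute und Methoden "
        "einer anderen Klasse zu übernehmen.")
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Note that the reported validation loss is NaN, so outputs from this checkpoint may be degenerate.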

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the sketch after this list):

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 10
  • mixed_precision_training: Native AMP
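
For reference, a sketch of how these hyperparameters map onto `Seq2SeqTrainingArguments`; the model, tokenizer, and dataset wiring are omitted, and `output_dir` is a placeholder:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="oop-de-qg-flan-t5-base-v3",  # placeholder
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=10,
    fp16=True,  # "Native AMP" mixed-precision training
    # Adam with betas=(0.9, 0.999) and epsilon=1e-08 is the Trainer
    # default optimizer configuration in this Transformers version.
)
```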

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Gen Len | Bleu | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 1.0 | 291 | nan | 8.0858 | 3.0935 | 7.2494 | 7.3009 | 58.0151 | 0.0107 | [0.04098414148665405, 0.014941302027748132, 0.007025441647909419, 0.003000697836706211] | 1.0 | 4.2235 | 15323 | 3628 |
| 0.0 | 2.0 | 582 | nan | 8.0858 | 3.0935 | 7.2494 | 7.3009 | 58.0151 | 0.0107 | [0.04098414148665405, 0.014941302027748132, 0.007025441647909419, 0.003000697836706211] | 1.0 | 4.2235 | 15323 | 3628 |
| 0.0 | 3.0 | 873 | nan | 8.0858 | 3.0935 | 7.2494 | 7.3009 | 58.0151 | 0.0107 | [0.04098414148665405, 0.014941302027748132, 0.007025441647909419, 0.003000697836706211] | 1.0 | 4.2235 | 15323 | 3628 |
| 0.0 | 4.0 | 1164 | nan | 8.0858 | 3.0935 | 7.2494 | 7.3009 | 58.0151 | 0.0107 | [0.04098414148665405, 0.014941302027748132, 0.007025441647909419, 0.003000697836706211] | 1.0 | 4.2235 | 15323 | 3628 |
| 0.0 | 5.0 | 1455 | nan | 8.0858 | 3.0935 | 7.2494 | 7.3009 | 58.0151 | 0.0107 | [0.04098414148665405, 0.014941302027748132, 0.007025441647909419, 0.003000697836706211] | 1.0 | 4.2235 | 15323 | 3628 |
| 0.0 | 6.0 | 1746 | nan | 8.0858 | 3.0935 | 7.2494 | 7.3009 | 58.0151 | 0.0107 | [0.04098414148665405, 0.014941302027748132, 0.007025441647909419, 0.003000697836706211] | 1.0 | 4.2235 | 15323 | 3628 |
| 0.0 | 7.0 | 2037 | nan | 8.0858 | 3.0935 | 7.2494 | 7.3009 | 58.0151 | 0.0107 | [0.04098414148665405, 0.014941302027748132, 0.007025441647909419, 0.003000697836706211] | 1.0 | 4.2235 | 15323 | 3628 |
| 0.0 | 8.0 | 2328 | nan | 8.0858 | 3.0935 | 7.2494 | 7.3009 | 58.0151 | 0.0107 | [0.04098414148665405, 0.014941302027748132, 0.007025441647909419, 0.003000697836706211] | 1.0 | 4.2235 | 15323 | 3628 |
| 0.0 | 9.0 | 2619 | nan | 8.0858 | 3.0935 | 7.2494 | 7.3009 | 58.0151 | 0.0107 | [0.04098414148665405, 0.014941302027748132, 0.007025441647909419, 0.003000697836706211] | 1.0 | 4.2235 | 15323 | 3628 |
| 0.0 | 10.0 | 2910 | nan | 8.0858 | 3.0935 | 7.2494 | 7.3009 | 58.0151 | 0.0107 | [0.04098414148665405, 0.014941302027748132, 0.007025441647909419, 0.003000697836706211] | 1.0 | 4.2235 | 15323 | 3628 |

Framework versions

  • Transformers 4.35.2
  • Pytorch 2.1.0+cu121
  • Datasets 2.16.1
  • Tokenizers 0.15.1