deductor-flant5-large

This model is a fine-tuned version of google/flan-t5-large (783M parameters, F32 weights) on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2461
  • ROUGE-1: 92.1213
  • ROUGE-2: 86.4281
  • ROUGE-L: 90.5846
  • ROUGE-Lsum: 90.5294
  • Gen Len: 11.2014
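The ROUGE scores above are F-measures scaled to 0–100, as reported by the Hugging Face `evaluate`/`rouge_score` tooling. A minimal sketch of what ROUGE-1 F1 measures, assuming plain whitespace tokenization (the official scorer additionally applies normalization and optional stemming):

```python
from collections import Counter

def rouge1_f1(prediction: str, reference: str) -> float:
    """Unigram-overlap F1 between a prediction and a reference (simplified ROUGE-1)."""
    pred_tokens = Counter(prediction.lower().split())
    ref_tokens = Counter(reference.lower().split())
    # Each token counts toward the overlap at most as often as it occurs in both texts.
    overlap = sum((pred_tokens & ref_tokens).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_tokens.values())
    recall = overlap / sum(ref_tokens.values())
    return 2 * precision * recall / (precision + recall)

print(round(100 * rouge1_f1("the cat sat on the mat", "the cat is on the mat"), 2))  # → 83.33
```

ROUGE-2 is the same computation over bigrams, and ROUGE-L/ROUGE-Lsum are based on longest common subsequences rather than fixed n-grams.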

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 32
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 64 (train_batch_size 16 × gradient_accumulation_steps 4)
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 10.0
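The hyperparameters above map directly onto Hugging Face `Seq2SeqTrainingArguments`. A sketch of that configuration, assuming an output directory name and evaluation cadence not stated in the card; the Adam betas and epsilon listed above are the library defaults:

```python
from transformers import Seq2SeqTrainingArguments

# Sketch only: output_dir and the eval/save cadence are assumptions.
training_args = Seq2SeqTrainingArguments(
    output_dir="deductor-flant5-large",   # hypothetical directory name
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=32,
    gradient_accumulation_steps=4,        # effective train batch size: 16 * 4 = 64
    num_train_epochs=10.0,
    lr_scheduler_type="linear",
    seed=42,
    # adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-8 are the defaults,
    # matching the optimizer settings reported above.
    predict_with_generate=True,           # needed to compute ROUGE and Gen Len at eval time
)
```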

Training results

| Training Loss | Epoch | Step | Validation Loss | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum | Gen Len |
|---------------|-------|------|-----------------|---------|---------|---------|------------|---------|
| 0.306         | 0.19  | 50   | 0.2959          | 89.3028 | 82.5127 | 87.4173 | 87.3544    | 11.2211 |
| 0.2774        | 0.38  | 100  | 0.2717          | 90.8414 | 84.2378 | 88.9385 | 88.9058    | 11.2571 |
| 0.2366        | 0.57  | 150  | 0.2613          | 91.0152 | 84.6687 | 89.2107 | 89.1735    | 11.2081 |
| 0.2166        | 0.77  | 200  | 0.2585          | 91.5215 | 85.4308 | 89.7742 | 89.7422    | 11.2802 |
| 0.22          | 0.96  | 250  | 0.2517          | 91.5587 | 85.6107 | 89.8835 | 89.8621    | 11.2655 |
| 0.1564        | 1.15  | 300  | 0.2630          | 91.999  | 86.0835 | 90.3611 | 90.3168    | 11.2039 |
| 0.1803        | 1.34  | 350  | 0.2546          | 91.5183 | 85.6214 | 89.9752 | 89.9323    | 11.2462 |
| 0.1737        | 1.53  | 400  | 0.2483          | 91.8342 | 86.0171 | 90.3042 | 90.2641    | 11.1943 |
| 0.157         | 1.72  | 450  | 0.2493          | 91.6585 | 85.4651 | 90.0181 | 89.9991    | 10.9376 |
| 0.1561        | 1.92  | 500  | 0.2461          | 92.1213 | 86.4281 | 90.5846 | 90.5294    | 11.2014 |
| 0.1191        | 2.11  | 550  | 0.2585          | 92.4493 | 86.6961 | 90.9293 | 90.8761    | 11.2416 |
| 0.1134        | 2.3   | 600  | 0.2633          | 92.4707 | 86.833  | 90.9516 | 90.9195    | 11.1675 |
| 0.1227        | 2.49  | 650  | 0.2592          | 92.2738 | 86.5064 | 90.7556 | 90.6998    | 11.2642 |
| 0.1175        | 2.68  | 700  | 0.2657          | 92.0861 | 86.2203 | 90.6168 | 90.5657    | 11.1700 |
| 0.1132        | 2.87  | 750  | 0.2644          | 92.3834 | 86.7237 | 90.8761 | 90.8389    | 11.2123 |
| 0.1097        | 3.07  | 800  | 0.2692          | 92.3356 | 86.7021 | 90.8717 | 90.8185    | 11.1822 |
| 0.0949        | 3.26  | 850  | 0.2690          | 92.5746 | 87.001  | 91.1734 | 91.1222    | 11.2785 |
| 0.0813        | 3.45  | 900  | 0.2875          | 92.5641 | 86.9813 | 91.0881 | 91.0411    | 11.2257 |
| 0.0861        | 3.64  | 950  | 0.2800          | 92.4738 | 86.9379 | 91.0384 | 90.9995    | 11.2136 |
| 0.0879        | 3.83  | 1000 | 0.2770          | 92.6025 | 87.105  | 91.1632 | 91.1292    | 11.2303 |
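The headline metrics at the top of this card come from the step-500 checkpoint, which has the lowest validation loss in the table. A quick check of that against the table data:

```python
# Validation loss per step, copied from the training-results table above.
val_loss_by_step = {
    50: 0.2959, 100: 0.2717, 150: 0.2613, 200: 0.2585, 250: 0.2517,
    300: 0.2630, 350: 0.2546, 400: 0.2483, 450: 0.2493, 500: 0.2461,
    550: 0.2585, 600: 0.2633, 650: 0.2592, 700: 0.2657, 750: 0.2644,
    800: 0.2692, 850: 0.2690, 900: 0.2875, 950: 0.2800, 1000: 0.2770,
}

best_step = min(val_loss_by_step, key=val_loss_by_step.get)
print(best_step, val_loss_by_step[best_step])  # → 500 0.2461
```

Note that ROUGE keeps improving after step 500 while validation loss rises, so later checkpoints trade loss for slightly higher ROUGE.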

Framework versions

  • Transformers 4.36.2
  • PyTorch 2.0.1
  • Datasets 2.18.0
  • Tokenizers 0.15.2