
FlanT5_Grammar_Correction

This model is a fine-tuned version of google/flan-t5-base for grammar correction, trained on an unspecified dataset. It achieves the following results on the evaluation set (a usage sketch follows the metrics):

  • Loss: 0.5641
  • Rouge1: 25.8411
  • Rouge2: 21.9266
  • RougeL: 25.6021
  • RougeLsum: 25.8204
  • Gen Len: 18.66
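
The model can be loaded with the standard transformers seq2seq classes. The sketch below is a minimal, hedged example: the hub repository ID `Floyd93/FlanT5_Grammar_Correction` is taken from this card, but the expected input format (e.g. whether a prompt prefix is required) is not documented, so the raw-sentence input is an assumption.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Repo ID from this card; raw-sentence input is an assumption, since the
# card does not document a required prompt prefix.
model_id = "Floyd93/FlanT5_Grammar_Correction"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("She no went to the market yesterday.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```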

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 10
  • eval_batch_size: 10
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 10
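
For reference, these settings map onto `Seq2SeqTrainingArguments` roughly as sketched below; `output_dir`, the evaluation strategy, and `predict_with_generate` are assumptions (the per-epoch metrics and Gen Len column suggest epoch-level evaluation with generation), not values stated on this card.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-grammar-correction",  # placeholder, not from the card
    learning_rate=5e-5,
    per_device_train_batch_size=10,
    per_device_eval_batch_size=10,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=10,
    evaluation_strategy="epoch",   # assumption: metrics are reported per epoch
    predict_with_generate=True,    # assumption: needed for ROUGE and Gen Len
)
```

The optimizer listed above (Adam with betas=(0.9, 0.999) and epsilon=1e-08) corresponds to the Trainer's default settings, so no explicit optimizer argument is needed.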

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | RougeL  | RougeLsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| No log        | 1.0   | 70   | 0.6908          | 24.6523 | 20.2616 | 24.4437 | 24.5628   | 18.7067 |
| No log        | 2.0   | 140  | 0.6518          | 24.8929 | 20.7179 | 24.6941 | 24.82     | 18.6933 |
| No log        | 3.0   | 210  | 0.6094          | 25.3966 | 21.4469 | 25.1476 | 25.3394   | 18.66   |
| No log        | 4.0   | 280  | 0.6017          | 25.6059 | 21.6233 | 25.3733 | 25.602    | 18.65   |
| No log        | 5.0   | 350  | 0.5839          | 25.6422 | 21.6618 | 25.399  | 25.6115   | 18.6533 |
| No log        | 6.0   | 420  | 0.5743          | 25.6713 | 21.6384 | 25.4196 | 25.645    | 18.6533 |
| No log        | 7.0   | 490  | 0.5710          | 25.7888 | 21.8155 | 25.5426 | 25.7576   | 18.6667 |
| 0.7559        | 8.0   | 560  | 0.5669          | 25.8358 | 21.8943 | 25.5741 | 25.8059   | 18.6667 |
| 0.7559        | 9.0   | 630  | 0.5651          | 25.8471 | 21.9295 | 25.606  | 25.8206   | 18.66   |
| 0.7559        | 10.0  | 700  | 0.5641          | 25.8411 | 21.9266 | 25.6021 | 25.8204   | 18.66   |
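
The ROUGE columns follow the naming of the `evaluate` library's rouge metric. Below is a minimal sketch of computing the same metric family, assuming that implementation (the exact evaluation pipeline is not documented here).

```python
import evaluate

# Assumption: scores like those above come from evaluate's "rouge" metric.
rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=["She did not go to the market yesterday."],
    references=["She didn't go to the market yesterday."],
)
print(scores)  # keys: rouge1, rouge2, rougeL, rougeLsum
```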

Framework versions

  • Transformers 4.36.2
  • Pytorch 2.1.0+cu121
  • Datasets 2.16.0
  • Tokenizers 0.15.0