t5-v1_1-large-gramatika1500k

This model is a fine-tuned version of google/t5-v1_1-large on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0386
  • Rouge1: 52.2432
  • Rouge2: 46.3929
  • Rougel: 52.1914
  • Rougelsum: 52.1955
  • Gen Len: 18.9096

Model description

More information needed

Intended uses & limitations

More information needed
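
Until usage is documented, here is a minimal inference sketch using the transformers API. The Hub repo id, the example sentence, and the absence of a task prefix are assumptions, not documented behavior:

```python
# Minimal inference sketch; the repo id and input format are assumptions.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "t5-v1_1-large-gramatika1500k"  # hypothetical repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Example sentence; whether the model expects a task prefix is undocumented.
inputs = tokenizer("She go to school yesterday.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```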

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a reproduction sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adafactor
  • lr_scheduler_type: linear
  • num_epochs: 10
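
A sketch of these settings expressed as Seq2SeqTrainingArguments; output_dir, predict_with_generate, and any argument not listed above are assumptions:

```python
from transformers import Seq2SeqTrainingArguments

# Mirrors the listed hyperparameters; unlisted arguments are assumptions.
training_args = Seq2SeqTrainingArguments(
    output_dir="t5-v1_1-large-gramatika1500k",  # assumed
    learning_rate=5e-05,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    optim="adafactor",            # Adafactor via the Trainer's optim flag
    lr_scheduler_type="linear",
    num_train_epochs=10,
    predict_with_generate=True,   # assumed; needed for ROUGE at eval time
)
```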

Training results

| Training Loss | Epoch | Step   | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:------:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 0.1341        | 1.33  | 100000 | 0.0603          | 51.1478 | 44.5632 | 51.0653 | 51.0706   | 18.9107 |
| 0.0608        | 2.67  | 200000 | 0.0469          | 51.7198 | 45.5159 | 51.6566 | 51.6625   | 18.9102 |
| 0.0465        | 4.0   | 300000 | 0.0417          | 51.97   | 45.93   | 51.9094 | 51.9137   | 18.9101 |
| 0.0375        | 5.33  | 400000 | 0.0402          | 52.1056 | 46.1587 | 52.0509 | 52.0577   | 18.9095 |
| 0.0322        | 6.67  | 500000 | 0.0388          | 52.1861 | 46.2939 | 52.1316 | 52.1371   | 18.9095 |
| 0.0285        | 8.0   | 600000 | 0.0386          | 52.2432 | 46.3929 | 52.1914 | 52.1955   | 18.9096 |
| 0.0253        | 9.34  | 700000 | 0.0390          | 52.2683 | 46.4315 | 52.2181 | 52.224    | 18.9094 |
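
For reference, ROUGE scores like those above are typically computed with the evaluate library; the predictions and references below are placeholders, and the table's values appear to be these scores scaled by 100:

```python
import evaluate

rouge = evaluate.load("rouge")
# Placeholder outputs and gold corrections, not data from this model.
predictions = ["She went to school yesterday."]
references = ["She went to school yesterday."]
scores = rouge.compute(predictions=predictions, references=references)
print(scores)  # keys: rouge1, rouge2, rougeL, rougeLsum (fractions in [0, 1])
```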

Framework versions

  • Transformers 4.31.0
  • Pytorch 1.11.0a0+b6df043
  • Datasets 2.12.0
  • Tokenizers 0.13.3