
speller-t5-big-2

This model is a fine-tuned version of sberbank-ai/ruT5-base; the training dataset is not specified in this card. It achieves the following results on the evaluation set (a usage sketch follows the metrics):

  • Loss: 0.1711
  • Rouge1: 22.619
  • Rouge2: 10.523
  • RougeL: 22.619
  • RougeLsum: 22.619
  • Gen Len: 42.9107
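
Since the card is otherwise sparse, here is a minimal inference sketch. The repo id, the example input, and the generation settings are assumptions (the model name suggests Russian spelling correction); substitute the actual Hub path for your setup:

```python
# Minimal inference sketch. Assumptions: the repo id "speller-t5-big-2" is a
# placeholder (use the real Hub path), and the task is Russian spelling
# correction, inferred from the model name.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("speller-t5-big-2")
model = T5ForConditionalGeneration.from_pretrained("speller-t5-big-2")

text = "превет, как дила?"  # intentionally misspelled input (placeholder)
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_length=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```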

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 1
  • mixed_precision_training: Native AMP
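
These settings map directly onto the standard Transformers training arguments. A minimal sketch (the output directory is a placeholder; the Adam betas/epsilon and linear scheduler listed above are the library defaults):

```python
# Sketch of how the listed hyperparameters map to Seq2SeqTrainingArguments
# in Transformers 4.26 (output_dir is a placeholder; dataset and trainer
# wiring are omitted).
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="speller-t5-big-2",   # placeholder path
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    num_train_epochs=1,
    lr_scheduler_type="linear",      # library default
    fp16=True,                       # "Native AMP" mixed precision
)
```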

Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge1  | Rouge2  | RougeL  | RougeLsum | Gen Len |
|---------------|-------|-------|-----------------|---------|---------|---------|-----------|---------|
| 1.244         | 0.04  | 500   | 0.5814          | 18.4902 | 6.4123  | 18.3883 | 18.5119   | 48.8214 |
| 0.6967        | 0.07  | 1000  | 0.4315          | 20.0    | 7.2173  | 20.0744 | 19.9702   | 47.0357 |
| 0.6362        | 0.11  | 1500  | 0.3721          | 21.1905 | 8.514   | 21.131  | 21.1607   | 47.3929 |
| 0.5561        | 0.14  | 2000  | 0.3265          | 22.0238 | 9.29    | 21.9643 | 21.994    | 45.6696 |
| 0.5094        | 0.18  | 2500  | 0.3049          | 22.0238 | 9.29    | 21.9643 | 21.994    | 46.0    |
| 0.429         | 0.21  | 3000  | 0.2858          | 22.0238 | 9.29    | 21.9643 | 21.994    | 44.9464 |
| 0.4557        | 0.25  | 3500  | 0.2696          | 22.1726 | 9.4388  | 22.0238 | 22.0982   | 45.2054 |
| 0.4268        | 0.29  | 4000  | 0.2565          | 22.1726 | 9.4388  | 22.0238 | 22.0982   | 44.5268 |
| 0.3955        | 0.32  | 4500  | 0.2480          | 22.1726 | 9.4388  | 22.0238 | 22.0982   | 44.2589 |
| 0.3672        | 0.36  | 5000  | 0.2387          | 22.619  | 10.523  | 22.619  | 22.619    | 44.2946 |
| 0.4059        | 0.39  | 5500  | 0.2268          | 22.619  | 10.523  | 22.619  | 22.619    | 44.1429 |
| 0.4005        | 0.43  | 6000  | 0.2216          | 22.619  | 10.523  | 22.619  | 22.619    | 44.4911 |
| 0.4176        | 0.47  | 6500  | 0.2187          | 22.619  | 10.523  | 22.619  | 22.619    | 44.1339 |
| 0.3413        | 0.5   | 7000  | 0.2115          | 22.619  | 10.523  | 22.619  | 22.619    | 43.9732 |
| 0.3618        | 0.54  | 7500  | 0.2068          | 22.619  | 10.523  | 22.619  | 22.619    | 43.9821 |
| 0.3157        | 0.57  | 8000  | 0.2037          | 22.619  | 10.523  | 22.619  | 22.619    | 43.0714 |
| 0.3502        | 0.61  | 8500  | 0.1956          | 22.619  | 10.523  | 22.619  | 22.619    | 42.8214 |
| 0.353         | 0.64  | 9000  | 0.1932          | 22.619  | 10.523  | 22.619  | 22.619    | 42.8393 |
| 0.3516        | 0.68  | 9500  | 0.1891          | 22.619  | 10.523  | 22.619  | 22.619    | 42.2589 |
| 0.3225        | 0.72  | 10000 | 0.1836          | 22.619  | 10.523  | 22.619  | 22.619    | 42.1964 |
| 0.2993        | 0.75  | 10500 | 0.1818          | 22.619  | 10.523  | 22.619  | 22.619    | 43.6607 |
| 0.3353        | 0.79  | 11000 | 0.1814          | 22.619  | 10.523  | 22.619  | 22.619    | 42.4018 |
| 0.3325        | 0.82  | 11500 | 0.1807          | 22.619  | 10.523  | 22.619  | 22.619    | 43.1786 |
| 0.3181        | 0.86  | 12000 | 0.1752          | 22.619  | 10.523  | 22.619  | 22.619    | 43.25   |
| 0.3337        | 0.9   | 12500 | 0.1729          | 22.619  | 10.523  | 22.619  | 22.619    | 42.3929 |
| 0.281         | 0.93  | 13000 | 0.1737          | 22.619  | 10.523  | 22.619  | 22.619    | 43.8214 |
| 0.45          | 0.97  | 13500 | 0.1711          | 22.619  | 10.523  | 22.619  | 22.619    | 42.9107 |
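
The card does not state how the ROUGE scores were computed. A plausible reconstruction uses the Hugging Face `evaluate` library (an assumption; the predictions and references below are placeholders, not the actual evaluation data):

```python
# Hypothetical sketch of the ROUGE computation; the evaluate library and the
# placeholder strings are assumptions, not part of the original card.
import evaluate

rouge = evaluate.load("rouge")
predictions = ["привет, как дела?"]  # model outputs (placeholder)
references = ["привет, как дела?"]   # gold corrections (placeholder)
scores = rouge.compute(predictions=predictions, references=references)
print(scores)  # keys: rouge1, rouge2, rougeL, rougeLsum
```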

Framework versions

  • Transformers 4.26.0
  • Pytorch 1.13.1+cu116
  • Datasets 2.9.0
  • Tokenizers 0.13.2