speller-t5-big-3

This model is a fine-tuned version of sberbank-ai/ruT5-base. The training dataset is not specified (the auto-generated card records it as None). It achieves the following results on the evaluation set:

  • Loss: 0.1829
  • Rouge1: 27.4616
  • Rouge2: 11.1083
  • Rougel: 27.5146
  • Rougelsum: 27.3079
  • Gen Len: 39.1171
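For readers unfamiliar with the metric, the following is a minimal sketch of what a ROUGE-1 F-measure computes: the clipped unigram overlap between a prediction and a reference. The card does not say which ROUGE implementation or tokenizer produced the numbers above, so the whitespace tokenization and the function name `rouge1_f` here are assumptions made for illustration only. The scores in the card are presumably the same kind of quantity scaled to 0-100.

```python
from collections import Counter


def rouge1_f(prediction: str, reference: str) -> float:
    """Unigram-overlap F-measure between two whitespace-tokenized strings.

    Illustrative only: tokenization by str.split() is an assumption,
    not necessarily what was used to evaluate this model.
    """
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Clipped overlap: a token counts at most as often as it occurs in both texts.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Example: one of three words differs between prediction and reference.
score = rouge1_f("привет мир сегодня", "привет мир завтра")
print(f"ROUGE-1 F (0-100 scale): {100 * score:.2f}")
```

Different tokenizers can change the score substantially, especially for non-English text, so values from this sketch should not be compared directly with the figures reported in the card.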

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 1
  • mixed_precision_training: Native AMP
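To illustrate what `lr_scheduler_type: linear` implies for the `learning_rate` of 5e-05, here is a small sketch of a linear decay schedule in plain Python. The card lists no warmup, so `warmup_steps=0` is an assumption, and the function name `linear_lr` and the `total_steps` value in the example are hypothetical. This is not the training code used for the model.

```python
def linear_lr(step: int, total_steps: int, base_lr: float = 5e-05,
              warmup_steps: int = 0) -> float:
    """Learning rate at `step` under linear warmup followed by linear decay to zero.

    warmup_steps defaults to 0 because the card lists no warmup (an assumption).
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Hypothetical run length of 1000 steps, chosen only to keep the example readable.
for s in (0, 250, 500, 750, 1000):
    print(s, linear_lr(s, total_steps=1000))
```

Under this schedule the learning rate is at its nominal value only at the first step and reaches zero at the final step of the single training epoch.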

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|---|---|---|---|---|---|---|---|---|
| 1.0936 | 0.04 | 500 | 0.5587 | 23.1392 | 7.0032 | 23.1709 | 23.1908 | 41.1081 |
| 0.8042 | 0.07 | 1000 | 0.4168 | 25.1867 | 8.9696 | 25.2993 | 25.1779 | 43.6486 |
| 0.634 | 0.11 | 1500 | 0.3611 | 26.0366 | 8.521 | 26.1568 | 25.9359 | 40.2613 |
| 0.5041 | 0.14 | 2000 | 0.3255 | 26.1019 | 8.7002 | 26.2473 | 25.983 | 40.7928 |
| 0.5279 | 0.18 | 2500 | 0.3041 | 26.1352 | 8.6265 | 26.2606 | 25.9482 | 39.6216 |
| 0.4838 | 0.22 | 3000 | 0.2784 | 26.6137 | 9.8094 | 26.8372 | 26.5692 | 39.3694 |
| 0.4512 | 0.25 | 3500 | 0.2700 | 25.6152 | 9.5832 | 25.7503 | 25.6898 | 38.7387 |
| 0.4412 | 0.29 | 4000 | 0.2612 | 25.6113 | 9.6697 | 25.7482 | 25.6838 | 39.1171 |
| 0.405 | 0.33 | 4500 | 0.2426 | 26.5151 | 9.6882 | 26.7719 | 26.4825 | 39.1892 |
| 0.3987 | 0.36 | 5000 | 0.2390 | 26.479 | 9.6144 | 26.6499 | 26.3759 | 39.0991 |
| 0.407 | 0.4 | 5500 | 0.2325 | 26.4499 | 9.6544 | 26.6649 | 26.3821 | 39.3784 |
| 0.406 | 0.43 | 6000 | 0.2266 | 26.6224 | 9.875 | 26.8468 | 26.6058 | 38.6486 |
| 0.3827 | 0.47 | 6500 | 0.2213 | 26.8997 | 10.0139 | 27.1249 | 26.8252 | 39.1712 |
| 0.334 | 0.51 | 7000 | 0.2247 | 26.7779 | 9.9399 | 26.9951 | 26.6453 | 39.7207 |
| 0.3463 | 0.54 | 7500 | 0.2145 | 26.879 | 9.9911 | 27.0863 | 26.7372 | 39.2432 |
| 0.3439 | 0.58 | 8000 | 0.2102 | 26.8839 | 10.0139 | 27.0715 | 26.7186 | 39.3694 |
| 0.3644 | 0.61 | 8500 | 0.2050 | 26.9076 | 10.0704 | 27.1328 | 26.8411 | 39.2252 |
| 0.3161 | 0.65 | 9000 | 0.2008 | 26.9219 | 10.1927 | 27.1542 | 26.8697 | 38.7928 |
| 0.3273 | 0.69 | 9500 | 0.2018 | 26.8221 | 9.9879 | 27.0473 | 26.7137 | 39.1892 |
| 0.3423 | 0.72 | 10000 | 0.1992 | 26.8572 | 10.0937 | 27.0701 | 26.7469 | 39.2342 |
| 0.3129 | 0.76 | 10500 | 0.1964 | 26.9076 | 10.0704 | 27.1328 | 26.8411 | 39.1712 |
| 0.2841 | 0.79 | 11000 | 0.1937 | 27.4202 | 10.9493 | 27.5146 | 27.2724 | 39.1261 |
| 0.2865 | 0.83 | 11500 | 0.1901 | 27.4559 | 11.0314 | 27.5146 | 27.3022 | 39.2072 |
| 0.2747 | 0.87 | 12000 | 0.1862 | 27.4127 | 10.9878 | 27.5146 | 27.2611 | 38.9459 |
| 0.2766 | 0.9 | 12500 | 0.1905 | 27.4616 | 11.1083 | 27.5146 | 27.3079 | 39.0991 |
| 0.3 | 0.94 | 13000 | 0.1866 | 27.4616 | 11.1083 | 27.5146 | 27.3079 | 39.0541 |
| 0.2729 | 0.98 | 13500 | 0.1829 | 27.4616 | 11.1083 | 27.5146 | 27.3079 | 39.1171 |
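The card does not state the size of the training set, but the last logged row (step 13500 at epoch 0.98) together with `train_batch_size: 8` allows a rough estimate. The sketch below is back-of-the-envelope arithmetic only. It assumes a single device and no gradient accumulation, neither of which the card confirms, and it treats the logged epoch as rounded to two decimals. The function name is hypothetical.

```python
def steps_per_epoch_bounds(step: int, logged_epoch: float,
                           half_width: float = 0.005) -> tuple[float, float]:
    """Bounds on steps per epoch, given that the epoch value is rounded to 2 decimals."""
    lower = step / (logged_epoch + half_width)
    upper = step / (logged_epoch - half_width)
    return lower, upper


# Values taken from the last row of the training results and the hyperparameter list.
LOGGED_STEP = 13500
LOGGED_EPOCH = 0.98
TRAIN_BATCH_SIZE = 8  # assumed to be the effective batch size (1 device, no accumulation)

low, high = steps_per_epoch_bounds(LOGGED_STEP, LOGGED_EPOCH)
print(f"steps per epoch: about {low:.0f} to {high:.0f}")
print(f"training examples: about {low * TRAIN_BATCH_SIZE:.0f} to {high * TRAIN_BATCH_SIZE:.0f}")
```

Under these assumptions the training set would contain on the order of 110,000 examples. If training used several devices or gradient accumulation, the true figure would be a multiple of this estimate.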

Framework versions

  • Transformers 4.26.0
  • PyTorch 1.13.1+cu116
  • Datasets 2.9.0
  • Tokenizers 0.13.2