
speller-t5-900

This model is a fine-tuned version of sberbank-ai/ruT5-base on an unspecified dataset. It achieves the following results on the evaluation set (a sketch of how such ROUGE scores are typically computed follows the list):

  • Loss: 0.1758
  • ROUGE-1: 19.3503
  • ROUGE-2: 8.3333
  • ROUGE-L: 19.3503
  • ROUGE-Lsum: 19.3503
  • Gen Len (mean length of generated sequences, in tokens): 41.4153
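For reference, scores in this style are usually produced with the Hugging Face `evaluate` package. The sketch below is illustrative only: the prediction and reference strings are placeholders, since this model's evaluation data is not published with the card.

```python
import evaluate

# Load the ROUGE metric; `compute` returns a dict keyed by
# rouge1, rouge2, rougeL, and rougeLsum.
rouge = evaluate.load("rouge")

predictions = ["привет как дела"]   # hypothetical model output
references = ["привет, как дела?"]  # hypothetical gold correction

scores = rouge.compute(predictions=predictions, references=references)
print(scores)
```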

Model description

More information needed

Intended uses & limitations

More information needed
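Although the card gives no usage details, loading the checkpoint follows the standard transformers seq2seq pattern. In this sketch the hub id is hypothetical, and the spelling-correction use case is inferred from the "speller" name rather than documented:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Hypothetical repository id -- substitute the actual hub path of this checkpoint.
model_id = "speller-t5-900"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Russian input with deliberate misspellings; spelling correction as the
# intended task is an assumption based on the model name.
text = "Превед, как дила?"
inputs = tokenizer(text, return_tensors="pt")
output_ids = model.generate(**inputs, max_length=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```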

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the configuration sketch after this list):

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 1
  • mixed_precision_training: Native AMP
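Expressed as a Seq2SeqTrainingArguments configuration, the values above would look roughly like this; `output_dir` and `predict_with_generate` are assumptions, not taken from the card:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="speller-t5-900",     # assumed output location
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,                  # Adam with betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=1,
    fp16=True,                       # "Native AMP" mixed precision
    predict_with_generate=True,      # assumed: needed for ROUGE / Gen Len eval
)
```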

Training results

| Training Loss | Epoch | Step | Validation Loss | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum | Gen Len |
|---|---|---|---|---|---|---|---|---|
| 1.0227 | 0.03 | 500 | 0.5411 | 17.6201 | 7.1186 | 17.6554 | 17.5847 | 45.5424 |
| 0.7224 | 0.07 | 1000 | 0.4269 | 18.1497 | 7.1186 | 18.1497 | 17.9732 | 42.7797 |
| 0.7101 | 0.1 | 1500 | 0.3542 | 18.9972 | 7.9661 | 18.9972 | 18.9619 | 42.3983 |
| 0.5962 | 0.14 | 2000 | 0.3283 | 18.9972 | 7.9661 | 18.9972 | 18.9619 | 42.2542 |
| 0.535 | 0.17 | 2500 | 0.3104 | 18.9972 | 7.9661 | 18.9972 | 18.9619 | 42.2627 |
| 0.6124 | 0.2 | 3000 | 0.2843 | 18.9972 | 7.9661 | 18.9972 | 18.9619 | 42.4915 |
| 0.491 | 0.24 | 3500 | 0.2706 | 18.9972 | 7.9661 | 18.9972 | 18.9619 | 42.4322 |
| 0.5028 | 0.27 | 4000 | 0.2647 | 19.5429 | 8.5876 | 19.5429 | 19.5621 | 42.3898 |
| 0.4547 | 0.31 | 4500 | 0.2548 | 18.9972 | 7.9661 | 18.9972 | 18.9619 | 42.178 |
| 0.4335 | 0.34 | 5000 | 0.2448 | 19.5429 | 8.5876 | 19.5429 | 19.5621 | 42.178 |
| 0.4511 | 0.38 | 5500 | 0.2377 | 19.4915 | 8.5876 | 19.4915 | 19.4915 | 42.3305 |
| 0.4765 | 0.41 | 6000 | 0.2337 | 19.5429 | 8.5876 | 19.5429 | 19.5621 | 41.4237 |
| 0.4355 | 0.44 | 6500 | 0.2233 | 19.4915 | 8.5876 | 19.4915 | 19.4915 | 41.7881 |
| 0.3924 | 0.48 | 7000 | 0.2172 | 19.4915 | 8.5876 | 19.4915 | 19.4915 | 40.9492 |
| 0.3898 | 0.51 | 7500 | 0.2153 | 19.4915 | 8.5876 | 19.4915 | 19.4915 | 41.6356 |
| 0.4236 | 0.55 | 8000 | 0.2102 | 19.4915 | 8.5876 | 19.4915 | 19.4915 | 41.0254 |
| 0.3484 | 0.58 | 8500 | 0.2116 | 19.4915 | 8.5876 | 19.4915 | 19.4915 | 41.8305 |
| 0.5514 | 0.61 | 9000 | 0.2017 | 19.6328 | 8.7571 | 19.5975 | 19.6328 | 41.1864 |
| 0.3298 | 0.65 | 9500 | 0.1945 | 19.6328 | 8.7571 | 19.5975 | 19.6328 | 41.2966 |
| 0.3807 | 0.68 | 10000 | 0.1966 | 19.6328 | 8.7571 | 19.5975 | 19.6328 | 41.6525 |
| 0.3177 | 0.72 | 10500 | 0.1918 | 19.3503 | 8.3333 | 19.3503 | 19.3503 | 41.2627 |
| 0.3374 | 0.75 | 11000 | 0.1903 | 19.6328 | 8.7571 | 19.5975 | 19.6328 | 41.2373 |
| 0.3123 | 0.78 | 11500 | 0.1900 | 19.6328 | 8.7571 | 19.5975 | 19.6328 | 41.2203 |
| 0.3377 | 0.82 | 12000 | 0.1847 | 19.6328 | 8.7571 | 19.5975 | 19.6328 | 41.2712 |
| 0.3138 | 0.85 | 12500 | 0.1814 | 19.6328 | 8.7571 | 19.5975 | 19.6328 | 41.1864 |
| 0.335 | 0.89 | 13000 | 0.1784 | 19.6328 | 8.7571 | 19.5975 | 19.6328 | 41.1695 |
| 0.3142 | 0.92 | 13500 | 0.1768 | 19.6328 | 8.7571 | 19.5975 | 19.6328 | 41.2542 |
| 0.3245 | 0.95 | 14000 | 0.1753 | 19.6328 | 8.7571 | 19.5975 | 19.6328 | 41.2034 |
| 0.3277 | 0.99 | 14500 | 0.1758 | 19.3503 | 8.3333 | 19.3503 | 19.3503 | 41.4153 |

Framework versions

  • Transformers 4.26.0
  • PyTorch 1.7.1+cu110
  • Datasets 2.9.0
  • Tokenizers 0.13.2