
speller-t5-9001

This model is a fine-tuned version of sberbank-ai/ruT5-base; the fine-tuning dataset is not specified. It achieves the following results on the evaluation set:

  • Loss: 0.1587
  • Rouge1: 17.0762
  • Rouge2: 5.6336
  • Rougel: 17.1181
  • Rougelsum: 17.2316
  • Gen Len: 40.2034
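The Rouge scores above are F-measures of n-gram overlap between generated and reference text, reported on a 0–100 scale. As a minimal illustration of what Rouge1 measures, here is a hand-rolled unigram-overlap F1 (a sketch only; the actual evaluation uses the standard ROUGE implementation, which also applies tokenization and stemming):

```python
from collections import Counter

def rouge1_f1(prediction: str, reference: str) -> float:
    """ROUGE-1 F1: clipped unigram overlap between prediction and reference."""
    pred_counts = Counter(prediction.split())
    ref_counts = Counter(reference.split())
    # Each reference unigram can be matched at most as many times as it occurs.
    overlap = sum(min(count, ref_counts[word]) for word, count in pred_counts.items())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)

print(rouge1_f1("the cat sat on the mat", "the cat is on the mat"))  # ≈ 0.833
```

Multiplying such a score by 100 gives the scale used in the table (e.g. Rouge1 of 17.0762 corresponds to an F1 of about 0.17).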

Model description

More information needed

Intended uses & limitations

More information needed
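No usage details are provided, but given the ruT5 base and the "speller" name, the model presumably takes a Russian sentence with spelling errors and generates a corrected one. A hypothetical loading-and-inference sketch with transformers (the repo id `speller-t5-9001` below is assumed from the card title; verify the actual hub path before use):

```python
import torch
from transformers import AutoTokenizer, T5ForConditionalGeneration

model_id = "speller-t5-9001"  # assumed repo id, taken from the card title
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = T5ForConditionalGeneration.from_pretrained(model_id)
model.eval()

text = "я пишу с ашипками"  # example input containing spelling errors
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(**inputs, max_length=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

Whether the model expects a task prefix, and how it behaves on already-correct text, is not documented here.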

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 1
  • mixed_precision_training: Native AMP
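With `lr_scheduler_type: linear` and no warmup steps listed, the learning rate decays linearly from 5e-05 toward 0 over training. A standalone sketch of that schedule shape (mirroring transformers' linear schedule, not calling it; the total step count below is an assumption, since only intermediate steps appear in the results table):

```python
def linear_lr(step: int, total_steps: int, base_lr: float = 5e-05,
              warmup_steps: int = 0) -> float:
    """Linear warmup (none configured here) followed by linear decay to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

total = 14650  # hypothetical: the exact number of optimizer steps is not stated
print(linear_lr(0, total))      # full base learning rate at the start
print(linear_lr(total, total))  # decayed to 0.0 at the end of the epoch
```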

Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge1  | Rouge2 | Rougel  | Rougelsum | Gen Len |
|---------------|-------|-------|-----------------|---------|--------|---------|-----------|---------|
| 1.172         | 0.03  | 500   | 0.5659          | 14.1669 | 4.1265 | 13.7878 | 14.1044   | 42.7458 |
| 0.7063        | 0.07  | 1000  | 0.4207          | 14.5638 | 4.8305 | 14.4688 | 14.5907   | 43.8898 |
| 0.6604        | 0.1   | 1500  | 0.3557          | 16.2672 | 4.8685 | 16.2308 | 16.3516   | 43.8644 |
| 0.5429        | 0.14  | 2000  | 0.3266          | 16.6436 | 5.1161 | 16.6667 | 16.6872   | 43.4831 |
| 0.5245        | 0.17  | 2500  | 0.2964          | 16.6667 | 5.1963 | 16.6775 | 16.7707   | 42.3983 |
| 0.5812        | 0.2   | 3000  | 0.2757          | 16.6969 | 5.339  | 16.7331 | 16.8449   | 41.3051 |
| 0.5019        | 0.24  | 3500  | 0.2626          | 16.686  | 5.4462 | 16.6815 | 16.8733   | 40.7203 |
| 0.4182        | 0.27  | 4000  | 0.2531          | 16.7142 | 5.5085 | 16.6667 | 16.9373   | 40.6102 |
| 0.4592        | 0.31  | 4500  | 0.2413          | 16.947  | 5.5404 | 16.9581 | 17.059    | 40.1441 |
| 0.4626        | 0.34  | 5000  | 0.2299          | 16.9492 | 5.6063 | 16.944  | 17.0235   | 40.3475 |
| 0.4158        | 0.38  | 5500  | 0.2228          | 16.8653 | 5.5608 | 16.9429 | 17.0407   | 39.5085 |
| 0.4261        | 0.41  | 6000  | 0.2185          | 16.9293 | 5.5843 | 16.9492 | 17.0365   | 39.8814 |
| 0.4465        | 0.44  | 6500  | 0.2088          | 16.9492 | 5.5968 | 16.9895 | 17.1106   | 39.4746 |
| 0.3919        | 0.48  | 7000  | 0.2015          | 16.9492 | 5.5843 | 16.9839 | 17.0937   | 39.6864 |
| 0.3994        | 0.51  | 7500  | 0.2023          | 17.0836 | 5.6632 | 17.0588 | 17.1895   | 40.5932 |
| 0.466         | 0.55  | 8000  | 0.1968          | 17.1664 | 5.7257 | 17.1664 | 17.3019   | 40.4153 |
| 0.419         | 0.58  | 8500  | 0.1899          | 17.0132 | 5.6021 | 17.0625 | 17.1945   | 39.4831 |
| 0.4047        | 0.61  | 9000  | 0.1877          | 17.0418 | 5.6217 | 16.9895 | 17.1106   | 39.9237 |
| 0.3728        | 0.65  | 9500  | 0.1798          | 16.9856 | 5.5876 | 16.9947 | 17.1612   | 39.4237 |
| 0.3685        | 0.68  | 10000 | 0.1768          | 16.9856 | 5.6249 | 16.9492 | 17.1339   | 39.2966 |
| 0.4241        | 0.72  | 10500 | 0.1739          | 16.9908 | 5.595  | 17.0532 | 17.1845   | 39.3814 |
| 0.3006        | 0.75  | 11000 | 0.1740          | 16.9492 | 5.5799 | 16.9802 | 17.1525   | 39.5085 |
| 0.339         | 0.78  | 11500 | 0.1739          | 17.0495 | 5.6497 | 17.047  | 17.1796   | 39.8136 |
| 0.3387        | 0.82  | 12000 | 0.1711          | 16.9908 | 5.595  | 17.0532 | 17.1845   | 39.4746 |
| 0.3116        | 0.85  | 12500 | 0.1642          | 16.9492 | 5.5799 | 16.9802 | 17.1525   | 39.161  |
| 0.3112        | 0.89  | 13000 | 0.1620          | 17.0021 | 5.6076 | 17.0374 | 17.1719   | 39.178  |
| 0.341         | 0.92  | 13500 | 0.1638          | 17.1664 | 5.7473 | 17.2279 | 17.3384   | 40.1864 |
| 0.2885        | 0.95  | 14000 | 0.1609          | 17.1664 | 5.7931 | 17.2504 | 17.3565   | 40.1356 |
| 0.3335        | 0.99  | 14500 | 0.1587          | 17.0762 | 5.6336 | 17.1181 | 17.2316   | 40.2034 |

Framework versions

  • Transformers 4.26.0
  • Pytorch 1.7.1+cu110
  • Datasets 2.9.0
  • Tokenizers 0.13.2