speller-t5-9001
This model is a fine-tuned version of sberbank-ai/ruT5-base. The training dataset is not specified (the auto-generated card records it as None). It achieves the following results on the evaluation set:
- Loss: 0.1587
- Rouge1: 17.0762
- Rouge2: 5.6336
- Rougel: 17.1181
- Rougelsum: 17.2316
- Gen Len: 40.2034
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 1
- mixed_precision_training: Native AMP
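The linear schedule listed above can be sketched in plain Python to show how the learning rate would decay over the single epoch. This is a minimal illustration, not the training script: it assumes no warmup (the card lists no warmup setting) and a total of about 14,650 optimizer steps, a figure estimated from the results table where step 14500 corresponds to epoch 0.99. The names `HPARAMS`, `ASSUMED_TOTAL_STEPS`, `ASSUMED_WARMUP_STEPS`, and `linear_lr` are hypothetical.

```python
# Hyperparameters as listed in the card.
HPARAMS = {
    "learning_rate": 5e-05,
    "train_batch_size": 8,
    "eval_batch_size": 8,
    "seed": 42,
    "adam_betas": (0.9, 0.999),
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_epochs": 1,
}

# Assumptions, NOT stated in the card: no warmup, roughly 14,650 total steps.
ASSUMED_TOTAL_STEPS = 14_650
ASSUMED_WARMUP_STEPS = 0


def linear_lr(step,
              base_lr=HPARAMS["learning_rate"],
              total_steps=ASSUMED_TOTAL_STEPS,
              warmup_steps=ASSUMED_WARMUP_STEPS):
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Print the scheduled learning rate at a few logged steps.
    for step in (0, 500, 7000, 14500):
        print(step, f"{linear_lr(step):.3e}")
```

Under these assumptions the rate starts at the listed 5e-05 and reaches zero at the end of the epoch, so the late checkpoints in the results table would have been trained with a very small step size.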
Training results
Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
---|---|---|---|---|---|---|---|---|
1.172 | 0.03 | 500 | 0.5659 | 14.1669 | 4.1265 | 13.7878 | 14.1044 | 42.7458 |
0.7063 | 0.07 | 1000 | 0.4207 | 14.5638 | 4.8305 | 14.4688 | 14.5907 | 43.8898 |
0.6604 | 0.1 | 1500 | 0.3557 | 16.2672 | 4.8685 | 16.2308 | 16.3516 | 43.8644 |
0.5429 | 0.14 | 2000 | 0.3266 | 16.6436 | 5.1161 | 16.6667 | 16.6872 | 43.4831 |
0.5245 | 0.17 | 2500 | 0.2964 | 16.6667 | 5.1963 | 16.6775 | 16.7707 | 42.3983 |
0.5812 | 0.2 | 3000 | 0.2757 | 16.6969 | 5.339 | 16.7331 | 16.8449 | 41.3051 |
0.5019 | 0.24 | 3500 | 0.2626 | 16.686 | 5.4462 | 16.6815 | 16.8733 | 40.7203 |
0.4182 | 0.27 | 4000 | 0.2531 | 16.7142 | 5.5085 | 16.6667 | 16.9373 | 40.6102 |
0.4592 | 0.31 | 4500 | 0.2413 | 16.947 | 5.5404 | 16.9581 | 17.059 | 40.1441 |
0.4626 | 0.34 | 5000 | 0.2299 | 16.9492 | 5.6063 | 16.944 | 17.0235 | 40.3475 |
0.4158 | 0.38 | 5500 | 0.2228 | 16.8653 | 5.5608 | 16.9429 | 17.0407 | 39.5085 |
0.4261 | 0.41 | 6000 | 0.2185 | 16.9293 | 5.5843 | 16.9492 | 17.0365 | 39.8814 |
0.4465 | 0.44 | 6500 | 0.2088 | 16.9492 | 5.5968 | 16.9895 | 17.1106 | 39.4746 |
0.3919 | 0.48 | 7000 | 0.2015 | 16.9492 | 5.5843 | 16.9839 | 17.0937 | 39.6864 |
0.3994 | 0.51 | 7500 | 0.2023 | 17.0836 | 5.6632 | 17.0588 | 17.1895 | 40.5932 |
0.466 | 0.55 | 8000 | 0.1968 | 17.1664 | 5.7257 | 17.1664 | 17.3019 | 40.4153 |
0.419 | 0.58 | 8500 | 0.1899 | 17.0132 | 5.6021 | 17.0625 | 17.1945 | 39.4831 |
0.4047 | 0.61 | 9000 | 0.1877 | 17.0418 | 5.6217 | 16.9895 | 17.1106 | 39.9237 |
0.3728 | 0.65 | 9500 | 0.1798 | 16.9856 | 5.5876 | 16.9947 | 17.1612 | 39.4237 |
0.3685 | 0.68 | 10000 | 0.1768 | 16.9856 | 5.6249 | 16.9492 | 17.1339 | 39.2966 |
0.4241 | 0.72 | 10500 | 0.1739 | 16.9908 | 5.595 | 17.0532 | 17.1845 | 39.3814 |
0.3006 | 0.75 | 11000 | 0.1740 | 16.9492 | 5.5799 | 16.9802 | 17.1525 | 39.5085 |
0.339 | 0.78 | 11500 | 0.1739 | 17.0495 | 5.6497 | 17.047 | 17.1796 | 39.8136 |
0.3387 | 0.82 | 12000 | 0.1711 | 16.9908 | 5.595 | 17.0532 | 17.1845 | 39.4746 |
0.3116 | 0.85 | 12500 | 0.1642 | 16.9492 | 5.5799 | 16.9802 | 17.1525 | 39.161 |
0.3112 | 0.89 | 13000 | 0.1620 | 17.0021 | 5.6076 | 17.0374 | 17.1719 | 39.178 |
0.341 | 0.92 | 13500 | 0.1638 | 17.1664 | 5.7473 | 17.2279 | 17.3384 | 40.1864 |
0.2885 | 0.95 | 14000 | 0.1609 | 17.1664 | 5.7931 | 17.2504 | 17.3565 | 40.1356 |
0.3335 | 0.99 | 14500 | 0.1587 | 17.0762 | 5.6336 | 17.1181 | 17.2316 | 40.2034 |
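The Epoch column in the table above is rounded to two decimals, so the number of optimizer steps per epoch can only be bracketed from the log, not read off exactly. The sketch below intersects the brackets implied by a few rows. Converting steps to training examples additionally assumes a single device and no gradient accumulation, so that one step consumes train_batch_size = 8 examples; neither is stated in the card. The function names are hypothetical.

```python
# (step, rounded epoch) pairs copied from three rows of the results table.
LOG = [(500, 0.03), (7000, 0.48), (14500, 0.99)]
TRAIN_BATCH_SIZE = 8  # from the hyperparameter list


def steps_per_epoch_bounds(step, epoch_rounded, half_width=0.005):
    """Bracket steps/epoch given that the epoch value was rounded to 2 decimals."""
    low = step / (epoch_rounded + half_width)
    high = step / (epoch_rounded - half_width)
    return low, high


def combined_bounds(log):
    """Intersect the brackets from every row: largest low, smallest high."""
    lows, highs = zip(*(steps_per_epoch_bounds(s, e) for s, e in log))
    return max(lows), min(highs)


if __name__ == "__main__":
    low, high = combined_bounds(LOG)
    print(f"steps per epoch: between {low:.0f} and {high:.0f}")
    # Assumption: one optimizer step == TRAIN_BATCH_SIZE examples.
    print(f"training examples: between {low * TRAIN_BATCH_SIZE:.0f} "
          f"and {high * TRAIN_BATCH_SIZE:.0f}")
```

The last row gives the tightest bracket because the rounding error is smallest relative to the epoch value there.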
Framework versions
- Transformers 4.26.0
- Pytorch 1.7.1+cu110
- Datasets 2.9.0
- Tokenizers 0.13.2