speller-t5-90
This model is a fine-tuned version of sberbank-ai/ruT5-base on an unspecified dataset. It achieves the following results on the evaluation set (final checkpoint, step 14500):
- Loss: 0.1486
- Rouge1: 19.3503
- Rouge2: 8.3898
- RougeL: 19.4209
- RougeLsum: 19.4915
- Gen Len: 41.3136
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 1
- mixed_precision_training: Native AMP
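To make the `linear` scheduler setting concrete, here is a minimal sketch of how the learning rate would decay from 5e-05 to zero over training. It assumes zero warmup steps (the card lists none) and uses a hypothetical total of 14,650 optimizer steps, which is a rough guess and not a value stated in this card. The function `linear_lr` is an illustrative helper, not part of any library.

```python
def linear_lr(step, total_steps, base_lr=5e-05, warmup_steps=0):
    """Learning rate at `step` under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # After warmup, decay linearly so the rate reaches 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Assumption: the total step count is an estimate, not taken from the card.
TOTAL_STEPS = 14_650

for step in (0, 500, 7_000, 14_500):
    print(f"step {step:>6}: lr = {linear_lr(step, TOTAL_STEPS):.3e}")
```

Under these assumptions the rate is at its full value at step 0 and close to zero by the last logged checkpoint, which is consistent with the validation loss flattening toward the end of the epoch.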
Training results
Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Gen Len |
---|---|---|---|---|---|---|---|---|
0.3435 | 0.03 | 500 | 0.2100 | 19.3503 | 8.3898 | 19.4209 | 19.4915 | 41.4492 |
0.3245 | 0.07 | 1000 | 0.2102 | 19.5975 | 8.7571 | 19.7034 | 19.774 | 41.1949 |
0.3777 | 0.1 | 1500 | 0.2010 | 19.3503 | 8.3898 | 19.4209 | 19.4915 | 41.0 |
0.3643 | 0.14 | 2000 | 0.1980 | 19.3503 | 8.3898 | 19.4209 | 19.4915 | 41.0593 |
0.3212 | 0.17 | 2500 | 0.1986 | 19.209 | 8.2062 | 19.2797 | 19.2797 | 41.1525 |
0.4181 | 0.2 | 3000 | 0.1896 | 19.3503 | 8.3898 | 19.4209 | 19.4915 | 42.2373 |
0.3175 | 0.24 | 3500 | 0.1879 | 19.3503 | 8.3898 | 19.4209 | 19.4915 | 41.4576 |
0.3399 | 0.27 | 4000 | 0.1838 | 19.3503 | 8.3898 | 19.4209 | 19.4915 | 41.1102 |
0.314 | 0.31 | 4500 | 0.1837 | 19.3503 | 8.3898 | 19.4209 | 19.4915 | 41.0339 |
0.3063 | 0.34 | 5000 | 0.1796 | 19.3503 | 8.3898 | 19.4209 | 19.4915 | 40.9407 |
0.3434 | 0.38 | 5500 | 0.1769 | 19.3503 | 8.3898 | 19.4209 | 19.4915 | 40.8814 |
0.376 | 0.41 | 6000 | 0.1790 | 19.3503 | 8.3898 | 19.4209 | 19.4915 | 41.0593 |
0.3355 | 0.44 | 6500 | 0.1735 | 19.3503 | 8.3898 | 19.4209 | 19.4915 | 41.4153 |
0.3181 | 0.48 | 7000 | 0.1665 | 19.3503 | 8.3898 | 19.4209 | 19.4915 | 41.0508 |
0.3017 | 0.51 | 7500 | 0.1701 | 19.3503 | 8.3898 | 19.4209 | 19.4915 | 41.2881 |
0.2953 | 0.55 | 8000 | 0.1664 | 19.3503 | 8.3898 | 19.4209 | 19.4915 | 41.2458 |
0.2711 | 0.58 | 8500 | 0.1664 | 19.5975 | 8.7571 | 19.7034 | 19.774 | 41.4068 |
0.3661 | 0.61 | 9000 | 0.1626 | 19.5975 | 8.7571 | 19.7034 | 19.774 | 41.2797 |
0.273 | 0.65 | 9500 | 0.1585 | 19.3503 | 8.3898 | 19.4209 | 19.4915 | 41.3051 |
0.3346 | 0.68 | 10000 | 0.1627 | 19.5975 | 8.7571 | 19.7034 | 19.774 | 41.2797 |
0.2529 | 0.72 | 10500 | 0.1590 | 19.3503 | 8.3898 | 19.4209 | 19.4915 | 41.2627 |
0.2926 | 0.75 | 11000 | 0.1601 | 19.5975 | 8.7571 | 19.7034 | 19.774 | 41.2712 |
0.2677 | 0.78 | 11500 | 0.1551 | 19.5975 | 8.7571 | 19.7034 | 19.774 | 41.2797 |
0.2746 | 0.82 | 12000 | 0.1570 | 19.5975 | 8.7571 | 19.7034 | 19.774 | 41.1186 |
0.2494 | 0.85 | 12500 | 0.1513 | 19.3503 | 8.3898 | 19.4209 | 19.4915 | 41.2373 |
0.2834 | 0.89 | 13000 | 0.1506 | 19.5975 | 8.7571 | 19.7034 | 19.774 | 41.2458 |
0.2646 | 0.92 | 13500 | 0.1512 | 19.5975 | 8.7571 | 19.7034 | 19.774 | 41.3729 |
0.2782 | 0.95 | 14000 | 0.1528 | 19.3503 | 8.3898 | 19.4209 | 19.4915 | 41.3644 |
0.2954 | 0.99 | 14500 | 0.1486 | 19.3503 | 8.3898 | 19.4209 | 19.4915 | 41.3136 |
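Because the training data is not documented, the table can be used for a rough size estimate. The sketch below derives steps per epoch and examples per epoch from the last logged row (step 14500 at epoch 0.99) and the batch size of 8. It assumes a single device and no gradient accumulation, neither of which the card confirms, and the epoch column is rounded to two decimals, so the result is only approximate. `estimate_epoch_size` is a hypothetical helper written for this illustration.

```python
def estimate_epoch_size(step, epoch_fraction, batch_size):
    """Estimate steps per epoch and training examples per epoch from one log row."""
    if epoch_fraction <= 0:
        raise ValueError("epoch_fraction must be positive")
    steps_per_epoch = step / epoch_fraction
    examples_per_epoch = steps_per_epoch * batch_size
    return steps_per_epoch, examples_per_epoch


# Values from the last row of the training results table and the hyperparameter list.
steps, examples = estimate_epoch_size(step=14500, epoch_fraction=0.99, batch_size=8)
print(f"about {steps:.0f} steps per epoch, about {examples:.0f} training examples")

# The epoch value is rounded, so bracket the estimate with the rounding bounds.
low = estimate_epoch_size(14500, 0.995, 8)[1]
high = estimate_epoch_size(14500, 0.985, 8)[1]
print(f"plausible range: {low:.0f} to {high:.0f} examples")
```

This points to a training set on the order of 117,000 examples, subject to the assumptions above.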
Framework versions
- Transformers 4.26.0
- PyTorch 1.7.1+cu110
- Datasets 2.9.0
- Tokenizers 0.13.2