
speller-t5-90

This model is a fine-tuned version of sberbank-ai/ruT5-base on an unspecified dataset (a usage sketch follows the metrics below). It achieves the following results on the evaluation set:

  • Loss: 0.1486
  • Rouge1: 19.3503
  • Rouge2: 8.3898
  • Rougel: 19.4209
  • Rougelsum: 19.4915
  • Gen Len: 41.3136
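
The card does not include a usage example, so here is a minimal inference sketch. It assumes the checkpoint is published on the Hub and, as the model name suggests, corrects spelling in Russian text; the repo id and the sample input are placeholders.

```python
# Minimal inference sketch. The repo id is a placeholder, and the model is
# assumed to be a seq2seq spelling corrector built on ruT5-base.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "<namespace>/speller-t5-90"  # hypothetical repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "превет, как дила"  # illustrative misspelled Russian input
inputs = tokenizer(text, return_tensors="pt")
output_ids = model.generate(**inputs, max_length=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```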

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 1
  • mixed_precision_training: Native AMP
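
For orientation, this is a hedged sketch of how the settings above map onto Hugging Face Seq2SeqTrainingArguments; the dataset, model, and Seq2SeqTrainer wiring are omitted, and output_dir is illustrative.

```python
# Sketch only: maps the listed hyperparameters onto Seq2SeqTrainingArguments.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="speller-t5-90",      # illustrative
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=1,
    fp16=True,                       # "Native AMP" mixed precision
    evaluation_strategy="steps",
    eval_steps=500,                  # matches the 500-step cadence below
    predict_with_generate=True,      # required to compute ROUGE during eval
)
```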

Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge1  | Rouge2 | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:------:|:-------:|:---------:|:-------:|
| 0.3435        | 0.03  | 500   | 0.2100          | 19.3503 | 8.3898 | 19.4209 | 19.4915   | 41.4492 |
| 0.3245        | 0.07  | 1000  | 0.2102          | 19.5975 | 8.7571 | 19.7034 | 19.774    | 41.1949 |
| 0.3777        | 0.1   | 1500  | 0.2010          | 19.3503 | 8.3898 | 19.4209 | 19.4915   | 41.0    |
| 0.3643        | 0.14  | 2000  | 0.1980          | 19.3503 | 8.3898 | 19.4209 | 19.4915   | 41.0593 |
| 0.3212        | 0.17  | 2500  | 0.1986          | 19.209  | 8.2062 | 19.2797 | 19.2797   | 41.1525 |
| 0.4181        | 0.2   | 3000  | 0.1896          | 19.3503 | 8.3898 | 19.4209 | 19.4915   | 42.2373 |
| 0.3175        | 0.24  | 3500  | 0.1879          | 19.3503 | 8.3898 | 19.4209 | 19.4915   | 41.4576 |
| 0.3399        | 0.27  | 4000  | 0.1838          | 19.3503 | 8.3898 | 19.4209 | 19.4915   | 41.1102 |
| 0.314         | 0.31  | 4500  | 0.1837          | 19.3503 | 8.3898 | 19.4209 | 19.4915   | 41.0339 |
| 0.3063        | 0.34  | 5000  | 0.1796          | 19.3503 | 8.3898 | 19.4209 | 19.4915   | 40.9407 |
| 0.3434        | 0.38  | 5500  | 0.1769          | 19.3503 | 8.3898 | 19.4209 | 19.4915   | 40.8814 |
| 0.376         | 0.41  | 6000  | 0.1790          | 19.3503 | 8.3898 | 19.4209 | 19.4915   | 41.0593 |
| 0.3355        | 0.44  | 6500  | 0.1735          | 19.3503 | 8.3898 | 19.4209 | 19.4915   | 41.4153 |
| 0.3181        | 0.48  | 7000  | 0.1665          | 19.3503 | 8.3898 | 19.4209 | 19.4915   | 41.0508 |
| 0.3017        | 0.51  | 7500  | 0.1701          | 19.3503 | 8.3898 | 19.4209 | 19.4915   | 41.2881 |
| 0.2953        | 0.55  | 8000  | 0.1664          | 19.3503 | 8.3898 | 19.4209 | 19.4915   | 41.2458 |
| 0.2711        | 0.58  | 8500  | 0.1664          | 19.5975 | 8.7571 | 19.7034 | 19.774    | 41.4068 |
| 0.3661        | 0.61  | 9000  | 0.1626          | 19.5975 | 8.7571 | 19.7034 | 19.774    | 41.2797 |
| 0.273         | 0.65  | 9500  | 0.1585          | 19.3503 | 8.3898 | 19.4209 | 19.4915   | 41.3051 |
| 0.3346        | 0.68  | 10000 | 0.1627          | 19.5975 | 8.7571 | 19.7034 | 19.774    | 41.2797 |
| 0.2529        | 0.72  | 10500 | 0.1590          | 19.3503 | 8.3898 | 19.4209 | 19.4915   | 41.2627 |
| 0.2926        | 0.75  | 11000 | 0.1601          | 19.5975 | 8.7571 | 19.7034 | 19.774    | 41.2712 |
| 0.2677        | 0.78  | 11500 | 0.1551          | 19.5975 | 8.7571 | 19.7034 | 19.774    | 41.2797 |
| 0.2746        | 0.82  | 12000 | 0.1570          | 19.5975 | 8.7571 | 19.7034 | 19.774    | 41.1186 |
| 0.2494        | 0.85  | 12500 | 0.1513          | 19.3503 | 8.3898 | 19.4209 | 19.4915   | 41.2373 |
| 0.2834        | 0.89  | 13000 | 0.1506          | 19.5975 | 8.7571 | 19.7034 | 19.774    | 41.2458 |
| 0.2646        | 0.92  | 13500 | 0.1512          | 19.5975 | 8.7571 | 19.7034 | 19.774    | 41.3729 |
| 0.2782        | 0.95  | 14000 | 0.1528          | 19.3503 | 8.3898 | 19.4209 | 19.4915   | 41.3644 |
| 0.2954        | 0.99  | 14500 | 0.1486          | 19.3503 | 8.3898 | 19.4209 | 19.4915   | 41.3136 |
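
Scores like the ROUGE columns above can be computed with the `evaluate` library; this is an assumption, since the card does not name the metric implementation. Note that `evaluate` returns fractions in [0, 1], while the card appears to report values scaled to 0–100. The strings below are illustrative only.

```python
# Illustrative only: computes ROUGE for a prediction/reference pair;
# the strings are made up, not taken from the evaluation set.
import evaluate

rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=["привет, как дела"],  # hypothetical model output
    references=["привет, как дела"],   # hypothetical gold correction
)
print(scores)  # keys: rouge1, rouge2, rougeL, rougeLsum
```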

Framework versions

  • Transformers 4.26.0
  • Pytorch 1.7.1+cu110
  • Datasets 2.9.0
  • Tokenizers 0.13.2