
speller-t5-big

This model is a fine-tuned version of sberbank-ai/ruT5-base (the fine-tuning dataset is not documented). It achieves the following results on the evaluation set:

  • Loss: 0.1571
  • Rouge1: 23.6607
  • Rouge2: 10.9056
  • RougeL: 23.8095
  • RougeLsum: 23.8095
  • Gen Len: 45.0357

Model description

More information needed

Intended uses & limitations

More information needed
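
Based on the model name and its sberbank-ai/ruT5-base checkpoint, the intended use appears to be Russian spelling correction. Below is a minimal inference sketch under that assumption; the checkpoint path and input sentence are placeholders, not taken from this card.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Hypothetical checkpoint path -- substitute the actual Hub id or local dir.
checkpoint = "speller-t5-big"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

# Toy misspelled Russian sentence (assumption: the model maps noisy text
# to its corrected form, seq2seq style).
text = "Я пишу граматный текст с ашибками."
inputs = tokenizer(text, return_tensors="pt")
output_ids = model.generate(**inputs, max_length=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```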

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch in code follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 1
  • mixed_precision_training: Native AMP
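
As a rough reference, these settings map onto transformers' `Seq2SeqTrainingArguments` as sketched below. The `output_dir` value is a placeholder, `fp16=True` stands in for "Native AMP", and the 500-step eval cadence is inferred from the results table rather than stated on this card.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch only: output_dir is hypothetical; eval cadence inferred from the
# results table below. Adam betas=(0.9, 0.999) and epsilon=1e-08 match the
# Trainer defaults (adam_beta1/adam_beta2/adam_epsilon), so they are omitted.
training_args = Seq2SeqTrainingArguments(
    output_dir="speller-t5-big",   # placeholder path
    learning_rate=5e-05,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=1,
    fp16=True,                     # "Native AMP" mixed precision
    evaluation_strategy="steps",   # assumption, matching the table below
    eval_steps=500,
    predict_with_generate=True,    # required to report ROUGE / Gen Len
)
```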

Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge1  | Rouge2  | RougeL  | RougeLsum | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 1.1176        | 0.04  | 500   | 0.5179          | 20.9642 | 10.0469 | 20.7738 | 20.9865   | 53.2232 |
| 0.8525        | 0.07  | 1000  | 0.3809          | 20.4241 | 9.7577  | 20.5357 | 20.6659   | 46.9196 |
| 0.677         | 0.11  | 1500  | 0.3261          | 21.131  | 9.6429  | 21.1607 | 21.4583   | 47.125  |
| 0.5804        | 0.14  | 2000  | 0.3008          | 22.7307 | 10.9056 | 22.8051 | 22.9167   | 46.7321 |
| 0.5251        | 0.18  | 2500  | 0.2805          | 22.3214 | 10.6737 | 22.4702 | 22.619    | 47.1339 |
| 0.5026        | 0.21  | 3000  | 0.2676          | 22.5298 | 10.7639 | 22.6488 | 22.6786   | 47.4464 |
| 0.433         | 0.25  | 3500  | 0.2515          | 23.6607 | 10.9056 | 23.8095 | 23.8095   | 46.8929 |
| 0.5218        | 0.29  | 4000  | 0.2383          | 23.6607 | 10.9056 | 23.8095 | 23.8095   | 46.6071 |
| 0.45          | 0.32  | 4500  | 0.2300          | 23.6607 | 10.9056 | 23.8095 | 23.8095   | 46.8304 |
| 0.3818        | 0.36  | 5000  | 0.2270          | 23.6607 | 10.9056 | 23.8095 | 23.8095   | 46.5625 |
| 0.4404        | 0.39  | 5500  | 0.2192          | 23.6607 | 10.9056 | 23.8095 | 23.8095   | 45.3304 |
| 0.3476        | 0.43  | 6000  | 0.2099          | 23.6607 | 10.9056 | 23.8095 | 23.8095   | 45.2857 |
| 0.3802        | 0.47  | 6500  | 0.2066          | 23.6607 | 10.9056 | 23.8095 | 23.8095   | 47.0    |
| 0.3423        | 0.5   | 7000  | 0.2003          | 23.6607 | 10.9056 | 23.8095 | 23.8095   | 45.3839 |
| 0.3288        | 0.54  | 7500  | 0.1924          | 23.6607 | 10.9056 | 23.8095 | 23.8095   | 45.7768 |
| 0.3788        | 0.57  | 8000  | 0.1897          | 23.6607 | 10.9056 | 23.8095 | 23.8095   | 46.3125 |
| 0.3475        | 0.61  | 8500  | 0.1905          | 23.6607 | 10.9056 | 23.8095 | 23.8095   | 45.8571 |
| 0.3229        | 0.64  | 9000  | 0.1829          | 22.9167 | 10.2679 | 22.9167 | 22.9167   | 46.6161 |
| 0.2918        | 0.68  | 9500  | 0.1840          | 23.6607 | 10.9056 | 23.8095 | 23.8095   | 46.0714 |
| 0.3683        | 0.72  | 10000 | 0.1834          | 22.9167 | 10.2679 | 22.9167 | 22.9167   | 46.8393 |
| 0.3074        | 0.75  | 10500 | 0.1758          | 23.6607 | 10.9056 | 23.8095 | 23.8095   | 45.875  |
| 0.329         | 0.79  | 11000 | 0.1686          | 22.9167 | 10.2679 | 22.9167 | 22.9167   | 46.7589 |
| 0.2692        | 0.82  | 11500 | 0.1699          | 22.9167 | 10.2679 | 22.9167 | 22.9167   | 46.6696 |
| 0.2775        | 0.86  | 12000 | 0.1732          | 23.4375 | 10.523  | 23.4375 | 23.5119   | 46.2232 |
| 0.2754        | 0.9   | 12500 | 0.1643          | 22.9167 | 10.2679 | 22.9167 | 22.9167   | 46.3125 |
| 0.3348        | 0.93  | 13000 | 0.1611          | 22.9167 | 10.2679 | 22.9167 | 22.9167   | 46.2411 |
| 0.2875        | 0.97  | 13500 | 0.1571          | 23.6607 | 10.9056 | 23.8095 | 23.8095   | 45.0357 |
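
The ROUGE columns above can be reproduced with the `evaluate` library, roughly as sketched below. The strings are toy placeholders (the evaluation data is not documented here), and note that the Trainer reports ROUGE scaled by 100, while `evaluate` returns fractions in [0, 1].

```python
import evaluate

rouge = evaluate.load("rouge")

# Toy placeholders -- the card does not publish its evaluation pairs.
predictions = ["я пишу грамотный текст"]
references = ["я пишу грамотный текст"]

scores = rouge.compute(predictions=predictions, references=references)
print(scores)  # keys: rouge1, rouge2, rougeL, rougeLsum (fractions in [0, 1])
```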

Framework versions

  • Transformers 4.26.0
  • PyTorch 1.13.1+cu116
  • Datasets 2.9.0
  • Tokenizers 0.13.2