
speller-t5-9

This model is a fine-tuned version of sberbank-ai/ruT5-base on an unspecified dataset (the card records the dataset name as None). It achieves the following results on the evaluation set:

  • Loss: 0.1614
  • Rouge1: 14.9554
  • Rouge2: 8.3333
  • Rougel: 14.9554
  • Rougelsum: 14.9554
  • Gen Len: 42.8661

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 1
  • mixed_precision_training: Native AMP
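To make the `lr_scheduler_type: linear` setting concrete, here is a minimal sketch of a linear decay schedule in plain Python. It assumes no warmup steps (none are listed above) and takes the total number of optimizer steps as a parameter, since the card does not state it. The function name `linear_lr` is hypothetical and is not part of Transformers.

```python
def linear_lr(step: int, total_steps: int,
              base_lr: float = 5e-05, warmup_steps: int = 0) -> float:
    """Learning rate at `step`: optional linear warmup, then linear decay to zero.

    Assumption: warmup_steps defaults to 0 because the card lists no warmup.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    if warmup_steps > 0 and step < warmup_steps:
        # Ramp up from 0 to base_lr over the warmup period.
        return base_lr * step / warmup_steps
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Illustrative total of 1000 steps (not the real training length).
    for s in (0, 250, 500, 1000):
        print(s, linear_lr(s, total_steps=1000))
```

Under these assumptions the rate starts at the configured 5e-05 and reaches zero at the final step, so evaluations late in the epoch are taken at a much smaller learning rate than early ones.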

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|---|---|---|---|---|---|---|---|---|
| 1.0873 | 0.04 | 500 | 0.5259 | 13.7946 | 7.1429 | 13.8393 | 13.8393 | 40.7946 |
| 0.6932 | 0.07 | 1000 | 0.3914 | 14.0625 | 8.3333 | 14.0625 | 14.0625 | 43.5357 |
| 0.5471 | 0.11 | 1500 | 0.3349 | 13.9633 | 7.9507 | 13.8641 | 13.9633 | 45.0089 |
| 0.5566 | 0.14 | 2000 | 0.2954 | 14.0625 | 8.3333 | 14.0625 | 14.0625 | 43.1429 |
| 0.4985 | 0.18 | 2500 | 0.2802 | 14.9554 | 8.3333 | 14.9554 | 14.9554 | 44.125 |
| 0.5175 | 0.22 | 3000 | 0.2631 | 14.9554 | 8.3333 | 14.9554 | 14.9554 | 44.4286 |
| 0.4377 | 0.25 | 3500 | 0.2431 | 14.9554 | 8.3333 | 14.9554 | 14.9554 | 42.5893 |
| 0.4356 | 0.29 | 4000 | 0.2315 | 14.9554 | 8.3333 | 14.9554 | 14.9554 | 42.9286 |
| 0.4052 | 0.32 | 4500 | 0.2258 | 14.9554 | 8.3333 | 14.9554 | 14.9554 | 43.2232 |
| 0.3888 | 0.36 | 5000 | 0.2179 | 14.9554 | 8.3333 | 14.9554 | 14.9554 | 42.6607 |
| 0.3731 | 0.39 | 5500 | 0.2063 | 14.9554 | 8.3333 | 14.9554 | 14.9554 | 42.9196 |
| 0.436 | 0.43 | 6000 | 0.2075 | 14.9554 | 8.3333 | 14.9554 | 14.9554 | 42.7589 |
| 0.42 | 0.47 | 6500 | 0.1993 | 14.9554 | 8.3333 | 14.9554 | 14.9554 | 42.5446 |
| 0.378 | 0.5 | 7000 | 0.2036 | 14.9554 | 8.3333 | 14.9554 | 14.9554 | 43.0179 |
| 0.3431 | 0.54 | 7500 | 0.1914 | 14.9554 | 8.3333 | 14.9554 | 14.9554 | 42.6875 |
| 0.3574 | 0.57 | 8000 | 0.1852 | 14.9554 | 8.3333 | 14.9554 | 14.9554 | 42.7321 |
| 0.302 | 0.61 | 8500 | 0.1900 | 14.9554 | 8.3333 | 14.9554 | 14.9554 | 42.7946 |
| 0.3081 | 0.65 | 9000 | 0.1807 | 14.9554 | 8.3333 | 14.9554 | 14.9554 | 42.7054 |
| 0.3266 | 0.68 | 9500 | 0.1755 | 14.9554 | 8.3333 | 14.9554 | 14.9554 | 42.5714 |
| 0.3834 | 0.72 | 10000 | 0.1726 | 14.9554 | 8.3333 | 14.9554 | 14.9554 | 42.8482 |
| 0.2802 | 0.75 | 10500 | 0.1736 | 14.9554 | 8.3333 | 14.9554 | 14.9554 | 42.8036 |
| 0.3013 | 0.79 | 11000 | 0.1675 | 14.9554 | 8.3333 | 14.9554 | 14.9554 | 42.7054 |
| 0.3404 | 0.83 | 11500 | 0.1630 | 14.9554 | 8.3333 | 14.9554 | 14.9554 | 42.6786 |
| 0.2945 | 0.86 | 12000 | 0.1627 | 14.9554 | 8.3333 | 14.9554 | 14.9554 | 42.6607 |
| 0.2819 | 0.9 | 12500 | 0.1633 | 14.9554 | 8.3333 | 14.9554 | 14.9554 | 42.7321 |
| 0.3028 | 0.93 | 13000 | 0.1597 | 14.9554 | 8.3333 | 14.9554 | 14.9554 | 42.6429 |
| 0.3138 | 0.97 | 13500 | 0.1614 | 14.9554 | 8.3333 | 14.9554 | 14.9554 | 42.8661 |
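
The lowest validation loss in the table above is 0.1597 at step 13000, slightly below the 0.1614 reported for the final evaluation at step 13500. The sketch below shows one way to pick that checkpoint from the logged rows and to roughly estimate the epoch length. The estimate is an assumption-laden back-of-the-envelope figure: epoch fractions are rounded to two decimals, and the example count assumes `train_batch_size: 8` on a single device with no gradient accumulation, none of which the card confirms.

```python
# (step, epoch, validation_loss) copied from the last four rows of the results table.
rows = [
    (12000, 0.86, 0.1627),
    (12500, 0.90, 0.1633),
    (13000, 0.93, 0.1597),
    (13500, 0.97, 0.1614),
]

# Checkpoint with the lowest validation loss among the listed rows.
best_step, _, best_loss = min(rows, key=lambda r: r[2])

# Rough steps-per-epoch estimate from the last logged row.
last_step, last_epoch, _ = rows[-1]
steps_per_epoch = round(last_step / last_epoch)

# Assumption: batch size 8, one device, no gradient accumulation.
approx_train_examples = steps_per_epoch * 8

print("best step:", best_step, "validation loss:", best_loss)
print("estimated steps per epoch:", steps_per_epoch)
print("estimated training examples:", approx_train_examples)
```

Under these assumptions one epoch is on the order of 13,900 steps, or roughly 111,000 training examples.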

Framework versions

  • Transformers 4.26.0
  • PyTorch 1.13.1+cu116
  • Datasets 2.9.0
  • Tokenizers 0.13.2