
speller-t5-909_both_

This model is a fine-tuned version of sberbank-ai/ruT5-large on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0771
  • Rouge1: 20.0565
  • Rouge2: 7.9096
  • Rougel: 20.1271
  • Rougelsum: 20.1977
  • Gen Len: 41.2712
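The card does not include a usage snippet, so here is a minimal sketch of loading a fine-tuned T5 checkpoint for spelling correction with `transformers`. The repo id `speller-t5-909_both_` and the example input are assumptions, not confirmed by the card; substitute the actual Hub id or local path of this checkpoint.

```python
# Minimal usage sketch. Assumptions: the checkpoint is available under the
# hypothetical id/path "speller-t5-909_both_", and the input text is an
# illustrative misspelled Russian sentence.
from transformers import AutoTokenizer, T5ForConditionalGeneration

model_name = "speller-t5-909_both_"  # hypothetical repo id or local path
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

text = "превет, как дила?"  # illustrative misspelled input
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_length=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```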

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 1
  • mixed_precision_training: Native AMP
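The `linear` scheduler above decays the learning rate from its initial value to zero over the course of training (optionally after a warmup phase). A minimal sketch of that schedule, assuming the Transformers default of zero warmup steps:

```python
def linear_lr(step, total_steps, base_lr=5e-5, warmup_steps=0):
    """Learning rate at a given step under a linear decay schedule.

    Ramps linearly from 0 to base_lr over warmup_steps, then decays
    linearly from base_lr to 0 at total_steps (warmup defaults to 0,
    matching the hyperparameters listed above).
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = (total_steps - step) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, remaining)


# With the card's learning_rate of 5e-05 and no warmup:
print(linear_lr(0, 14649))      # start of training: 5e-05
print(linear_lr(14649, 14649))  # end of training: 0.0
```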

Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge1  | Rouge2 | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:------:|:-------:|:---------:|:-------:|
| 0.1653        | 0.1   | 1500  | 0.1176          | 19.8446 | 7.4011 | 19.8446 | 19.9153   | 41.2712 |
| 0.2083        | 0.2   | 3000  | 0.1023          | 19.7034 | 8.7571 | 19.7034 | 19.774    | 41.1186 |
| 0.1617        | 0.31  | 4500  | 0.0975          | 19.2797 | 7.9096 | 19.2797 | 19.209    | 41.2797 |
| 0.17          | 0.41  | 6000  | 0.0949          | 20.5508 | 8.7571 | 20.5862 | 20.6215   | 41.2712 |
| 0.1416        | 0.51  | 7500  | 0.0871          | 20.0565 | 7.9096 | 20.1271 | 20.1977   | 41.1017 |
| 0.1409        | 0.61  | 9000  | 0.0807          | 20.0565 | 7.9096 | 20.1271 | 20.1977   | 41.1695 |
| 0.1094        | 0.72  | 10500 | 0.0746          | 19.9859 | 7.6271 | 19.9506 | 19.9859   | 41.2627 |
| 0.1256        | 0.82  | 12000 | 0.0754          | 19.9859 | 7.6271 | 19.9506 | 19.9859   | 41.2119 |
| 0.1206        | 0.92  | 13500 | 0.0771          | 20.0565 | 7.9096 | 20.1271 | 20.1977   | 41.2712 |
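The Rouge1 column above is an F1 score over unigram overlap between generated and reference text (in practice it is computed with a library such as `rouge_score` or `evaluate`, typically with stemming and other normalization). A minimal whitespace-tokenized illustration of the underlying ROUGE-1 F1 computation:

```python
from collections import Counter


def rouge1_f1(prediction, reference):
    """ROUGE-1 F1: harmonic mean of unigram precision and recall.

    Simplified sketch: whitespace tokenization only, no stemming or
    normalization as real ROUGE implementations apply.
    """
    pred_tokens = prediction.split()
    ref_tokens = reference.split()
    # Clipped unigram overlap: each token counts at most as often
    # as it appears in both strings.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


print(rouge1_f1("привет как дела", "привет как дела"))  # 1.0
```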

Framework versions

  • Transformers 4.26.0
  • Pytorch 1.13.1+cu116
  • Datasets 2.9.0
  • Tokenizers 0.13.2