
speller-t5-909

This model is a fine-tuned version of sberbank-ai/ruT5-large on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0814
  • Rouge1: 18.2203
  • Rouge2: 5.9322
  • Rougel: 17.7966
  • Rougelsum: 18.2203
  • Gen Len: 42.0424
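The card does not yet include a usage snippet, so here is a minimal, hypothetical inference sketch for this seq2seq speller. `MODEL_ID` is a placeholder (the card does not state the checkpoint's hub path), and `correct_spelling` is an illustrative helper, not part of the repository:

```python
# Hypothetical helper for running this seq2seq speller checkpoint.
# MODEL_ID is a placeholder; substitute the actual hub path of this model.
MODEL_ID = "speller-t5-909"

def correct_spelling(text, tokenizer, model, max_length=64):
    """Tokenize one string, generate with the fine-tuned T5, and decode.

    `tokenizer` and `model` are expected to be a Transformers tokenizer and
    a T5ForConditionalGeneration instance loaded from MODEL_ID.
    """
    inputs = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(**inputs, max_length=max_length)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Loading would follow the standard Transformers pattern, assuming the checkpoint is published: `AutoTokenizer.from_pretrained(MODEL_ID)` and `T5ForConditionalGeneration.from_pretrained(MODEL_ID)`, then `correct_spelling("текст с ошибкой", tokenizer, model)`.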

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 1
  • mixed_precision_training: Native AMP
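With `lr_scheduler_type: linear`, one epoch, and the Trainer's default of zero warmup steps, the learning rate decays linearly from 5e-05 toward zero over the run. A small sketch of that decay (the total step count below is hypothetical, chosen only for illustration):

```python
# Linear learning-rate decay as implied by lr_scheduler_type: linear with
# no warmup: lr falls from BASE_LR at step 0 to 0 at the final step.
BASE_LR = 5e-5

def linear_lr(step, total_steps, base_lr=BASE_LR):
    """Learning rate at `step` under a zero-warmup linear schedule."""
    remaining = max(0.0, 1.0 - step / total_steps)
    return base_lr * remaining

# With a hypothetical 10000-step run:
# linear_lr(0, 10000)    -> 5e-05
# linear_lr(5000, 10000) -> 2.5e-05
# linear_lr(10000, 10000) -> 0.0
```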

Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge1  | Rouge2 | Rougel  | Rougelsum | Gen Len |
|---------------|-------|-------|-----------------|---------|--------|---------|-----------|---------|
| 0.3022        | 0.1   | 1500  | 0.1563          | 18.2203 | 5.9322 | 17.7966 | 18.2203   | 43.4492 |
| 0.2274        | 0.2   | 3000  | 0.1311          | 18.2203 | 5.9322 | 17.7966 | 18.2203   | 42.3814 |
| 0.2001        | 0.31  | 4500  | 0.1128          | 18.2203 | 5.9322 | 17.7966 | 18.2203   | 41.9407 |
| 0.1757        | 0.41  | 6000  | 0.1063          | 18.2203 | 5.9322 | 17.7966 | 18.2203   | 42.2542 |
| 0.1612        | 0.51  | 7500  | 0.1002          | 17.9379 | 5.0847 | 17.5141 | 17.7966   | 42.339  |
| 0.1718        | 0.61  | 9000  | 0.0921          | 18.2203 | 5.9322 | 17.7966 | 18.2203   | 42.0508 |
| 0.1678        | 0.72  | 10500 | 0.0834          | 17.7966 | 5.0847 | 17.3729 | 17.7966   | 41.9831 |
| 0.1407        | 0.82  | 12000 | 0.0793          | 18.2203 | 5.9322 | 17.7966 | 18.2203   | 42.2119 |
| 0.1447        | 0.92  | 13500 | 0.0814          | 18.2203 | 5.9322 | 17.7966 | 18.2203   | 42.0424 |

Framework versions

  • Transformers 4.26.0
  • Pytorch 1.13.1+cu116
  • Datasets 2.9.0
  • Tokenizers 0.13.2