metadata
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: speller-t5-8
    results: []

speller-t5-8

This model is a fine-tuned version of sberbank-ai/ruT5-base. The training dataset was not recorded. The final checkpoint (step 13500) achieves the following results on the evaluation set:

  • Loss: 0.1626
  • Rouge1: 14.5345
  • Rouge2: 9.6847
  • RougeL: 14.4144
  • RougeLsum: 14.3544
  • Gen Len: 37.7748
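As a reminder of what the Rouge1 figure measures, here is a minimal sketch of a unigram-overlap F-measure on a 0-100 scale. It is not the evaluation code used for this card, which does not say how ROUGE was computed; the function name `rouge1_f`, the `token_pattern` parameter, and the example sentences are all illustrative assumptions. The second call shows how an ASCII-only tokenizer, if one were used, would discard Cyrillic text entirely and depress the score for a Russian-language model.

```python
import re
from collections import Counter


def rouge1_f(reference, prediction, token_pattern=r"\w+"):
    """Unigram-overlap F-measure between two strings, scaled to 0-100.

    token_pattern is a hypothetical knob for this sketch: the default
    r"\w+" is Unicode-aware in Python 3, so Cyrillic words are kept.
    """
    ref_counts = Counter(re.findall(token_pattern, reference.lower()))
    pred_counts = Counter(re.findall(token_pattern, prediction.lower()))
    # Clipped overlap: each token counts at most as often as it appears in both.
    overlap = sum((ref_counts & pred_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 100 * 2 * precision * recall / (precision + recall)


# Illustrative spelling-correction pair (not taken from the evaluation set).
reference = "мама мыла раму"
prediction = "мама мыла рану"

unicode_score = rouge1_f(reference, prediction)
# An ASCII-only pattern finds no tokens in Cyrillic text at all.
ascii_score = rouge1_f(reference, prediction, token_pattern=r"[a-z0-9]+")
```

With the Unicode-aware pattern, two of three words match and the score is about 66.67; with the ASCII-only pattern it is 0.0. Whether this effect explains the low, plateauing ROUGE values reported here is not something the card confirms.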

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 1
  • mixed_precision_training: Native AMP
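The hyperparameters above can be sketched as a plain dictionary whose keys follow the argument names of `transformers.Seq2SeqTrainingArguments`. This is an assumed mapping, not the author's training script: `output_dir` is a hypothetical path, the batch size is assumed to be per device on a single device, and the helper `linear_lr` assumes zero warmup steps because none are listed.

```python
# Keyword arguments mirroring the listed hyperparameters (assumed mapping).
training_kwargs = {
    "output_dir": "speller-t5-8",        # hypothetical output path
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 8,    # assumes a single device
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 1,
    "fp16": True,                        # Native AMP mixed precision
}


def linear_lr(step, total_steps, base_lr=5e-5, warmup_steps=0):
    """Learning rate under a linear schedule: optional warmup, then decay to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Example with a hypothetical run length of 1000 optimizer steps.
lr_start = linear_lr(0, 1000)
lr_middle = linear_lr(500, 1000)
lr_end = linear_lr(1000, 1000)
```

With transformers installed, the dictionary could be passed as `Seq2SeqTrainingArguments(**training_kwargs)`. The schedule starts at the full learning rate, reaches half of it at the midpoint, and ends at zero.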

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Gen Len |
|---|---|---|---|---|---|---|---|---|
| 0.9929 | 0.04 | 500 | 0.5644 | 12.8915 | 6.982 | 12.7499 | 12.6126 | 39.3604 |
| 1.2778 | 0.07 | 1000 | 0.4272 | 13.8323 | 8.1081 | 13.7538 | 13.6059 | 38.8288 |
| 0.6384 | 0.11 | 1500 | 0.3654 | 14.1341 | 8.5586 | 14.034 | 13.9139 | 38.4865 |
| 0.5993 | 0.15 | 2000 | 0.3318 | 14.009 | 8.3655 | 13.9339 | 13.7688 | 38.8018 |
| 0.5327 | 0.18 | 2500 | 0.2969 | 14.2342 | 8.7838 | 14.1141 | 13.9339 | 37.7928 |
| 0.4831 | 0.22 | 3000 | 0.2761 | 14.5345 | 9.6847 | 14.4144 | 14.3544 | 38.9369 |
| 0.4582 | 0.25 | 3500 | 0.2562 | 14.5345 | 9.6847 | 14.4144 | 14.3544 | 39.2162 |
| 0.3983 | 0.29 | 4000 | 0.2449 | 14.5345 | 9.6847 | 14.4144 | 14.3544 | 38.9459 |
| 0.4459 | 0.33 | 4500 | 0.2422 | 14.4717 | 9.5045 | 14.4144 | 14.2206 | 38.8378 |
| 0.4073 | 0.36 | 5000 | 0.2375 | 14.5345 | 9.6847 | 14.4144 | 14.3544 | 38.3964 |
| 0.4047 | 0.4 | 5500 | 0.2263 | 14.5345 | 9.6847 | 14.4144 | 14.3544 | 39.2703 |
| 0.3423 | 0.44 | 6000 | 0.2208 | 14.5345 | 9.6847 | 14.4144 | 14.3544 | 38.5405 |
| 0.3348 | 0.47 | 6500 | 0.2109 | 14.5345 | 9.6847 | 14.4144 | 14.3544 | 37.7568 |
| 0.3421 | 0.51 | 7000 | 0.2053 | 14.5345 | 9.6847 | 14.4144 | 14.3544 | 37.7117 |
| 0.3319 | 0.54 | 7500 | 0.2025 | 14.2342 | 8.7838 | 14.1141 | 14.0541 | 37.5586 |
| 0.3239 | 0.58 | 8000 | 0.1991 | 14.2342 | 8.7838 | 14.1141 | 14.0541 | 38.0541 |
| 0.2963 | 0.62 | 8500 | 0.1959 | 14.2342 | 8.7838 | 14.1141 | 14.0541 | 38.0 |
| 0.3117 | 0.65 | 9000 | 0.1899 | 14.2342 | 8.7838 | 14.1141 | 14.0541 | 37.7117 |
| 0.2737 | 0.69 | 9500 | 0.1898 | 14.2342 | 8.7838 | 14.1141 | 14.0541 | 37.5135 |
| 0.3425 | 0.73 | 10000 | 0.1830 | 14.2342 | 8.7838 | 14.1141 | 14.0541 | 37.6306 |
| 0.2986 | 0.76 | 10500 | 0.1831 | 14.5345 | 9.6847 | 14.4144 | 14.3544 | 37.7658 |
| 0.3312 | 0.8 | 11000 | 0.1734 | 14.2342 | 8.7838 | 14.1141 | 14.0541 | 37.973 |
| 0.3461 | 0.83 | 11500 | 0.1753 | 14.5345 | 9.6847 | 14.4144 | 14.3544 | 37.6847 |
| 0.2786 | 0.87 | 12000 | 0.1740 | 14.5345 | 9.6847 | 14.4144 | 14.3544 | 37.7748 |
| 0.2911 | 0.91 | 12500 | 0.1672 | 14.5345 | 9.6847 | 14.4144 | 14.3544 | 37.7387 |
| 0.2618 | 0.94 | 13000 | 0.1691 | 14.5345 | 9.6847 | 14.4144 | 14.3544 | 37.5135 |
| 0.2844 | 0.98 | 13500 | 0.1626 | 14.5345 | 9.6847 | 14.4144 | 14.3544 | 37.7748 |
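Because the training dataset is not documented, its size can only be estimated from the last logged row (step 13500 at epoch 0.98) and the train batch size of 8. The sketch below does that arithmetic. It assumes one optimizer step per batch of 8 examples (no gradient accumulation, single device), neither of which the card confirms, and the variable names are illustrative.

```python
# Values taken from the last row of the training log and the hyperparameter list.
last_step = 13500
last_epoch = 0.98        # rounded to two decimals in the log
train_batch_size = 8     # assumed: examples consumed per optimizer step

# Point estimate of optimizer steps in one full epoch.
steps_per_epoch = last_step / last_epoch

# The epoch value is rounded, so the true fraction lies in [0.975, 0.985).
steps_low = last_step / 0.985
steps_high = last_step / 0.975

# Approximate number of training examples under the assumptions above.
approx_examples = steps_per_epoch * train_batch_size

print(f"steps per epoch: {steps_low:.0f} to {steps_high:.0f}")
print(f"approximate training examples: {approx_examples:.0f}")
```

Under these assumptions one epoch is roughly 13,700 to 13,850 steps, which corresponds to about 110,000 training examples.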

Framework versions

  • Transformers 4.26.0
  • PyTorch 1.13.1+cu116
  • Datasets 2.9.0
  • Tokenizers 0.13.2