spell_corrector_bert2bert_1809_v3

This model is a fine-tuned version of a base model that the card does not name, trained on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0003
  • Bleu: 65.6869
  • Gen Len: 16.0563

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 15
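
As a rough illustration of the `linear` scheduler entry above, the sketch below computes the learning rate at a given optimizer step. It assumes no warmup (the card lists none) and uses the 7320 total steps reported in the training results. The function name `linear_lr` and its parameters are hypothetical, chosen for this example only; the formula mirrors the usual linear decay to zero.

```python
def linear_lr(step: int,
              base_lr: float = 5e-05,
              total_steps: int = 7320,
              warmup_steps: int = 0) -> float:
    """Learning rate under linear decay to zero, with optional linear warmup.

    Assumption: warmup_steps defaults to 0 because the card lists no warmup.
    """
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Print the learning rate at the end of each of the 15 epochs
    # (488 optimizer steps per epoch, per the training results).
    for epoch in range(1, 16):
        step = epoch * 488
        print(f"epoch {epoch:2d}  step {step:5d}  lr {linear_lr(step):.3e}")
```

Under these assumptions the rate starts at 5e-05, is halved by step 3660, and reaches zero at step 7320.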

Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu    | Gen Len |
|---------------|-------|------|-----------------|---------|---------|
| No log        | 1.0   | 488  | 0.3069          | 51.7893 | 16.1497 |
| 0.9648        | 2.0   | 976  | 0.1273          | 60.1518 | 16.0477 |
| 0.4398        | 3.0   | 1464 | 0.0652          | 63.3763 | 16.0624 |
| 0.2445        | 4.0   | 1952 | 0.0458          | 64.0081 | 16.0741 |
| 0.1417        | 5.0   | 2440 | 0.0270          | 64.8793 | 16.057  |
| 0.0958        | 6.0   | 2928 | 0.0230          | 65.1021 | 16.0524 |
| 0.0667        | 7.0   | 3416 | 0.0152          | 65.3175 | 16.0559 |
| 0.051         | 8.0   | 3904 | 0.0123          | 65.4032 | 16.0619 |
| 0.0382        | 9.0   | 4392 | 0.0079          | 65.4965 | 16.0555 |
| 0.028         | 10.0  | 4880 | 0.0058          | 65.5698 | 16.0577 |
| 0.0192        | 11.0  | 5368 | 0.0036          | 65.6254 | 16.0565 |
| 0.0143        | 12.0  | 5856 | 0.0020          | 65.6452 | 16.0581 |
| 0.0109        | 13.0  | 6344 | 0.0009          | 65.679  | 16.0563 |
| 0.0075        | 14.0  | 6832 | 0.0004          | 65.6856 | 16.0563 |
| 0.0052        | 15.0  | 7320 | 0.0003          | 65.6869 | 16.0563 |
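
The card does not describe the training data, but the step counts above allow a rough bound on its size. The sketch below assumes single-device training, no gradient accumulation, and that the last partial batch of each epoch is kept; none of these is stated in the card. The helper name `train_set_size_bounds` is hypothetical.

```python
from typing import Tuple


def train_set_size_bounds(steps_per_epoch: int, batch_size: int) -> Tuple[int, int]:
    """Smallest and largest example counts that yield `steps_per_epoch` batches.

    Assumes one optimizer step per batch and that a final partial batch
    still counts as a step (i.e. it is not dropped).
    """
    smallest = (steps_per_epoch - 1) * batch_size + 1  # last batch holds 1 example
    largest = steps_per_epoch * batch_size             # last batch is full
    return smallest, largest


steps_per_epoch = 7320 // 15  # 488 steps per epoch, from the results table
bounds = train_set_size_bounds(steps_per_epoch, batch_size=16)
print(bounds)  # (7793, 7808)
```

Under these assumptions the training set holds between 7793 and 7808 examples; multi-GPU training or gradient accumulation would scale the estimate accordingly.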

Framework versions

  • Transformers 4.33.2
  • Pytorch 2.0.1+cu118
  • Datasets 2.14.5
  • Tokenizers 0.13.3