spell_corrector_bert2bert_cased_1010_v3

This model is a fine-tuned version of an unspecified base model, trained on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0001
  • Bleu: 72.1549
  • Gen Len: 15.509

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20
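
As a rough illustration of what the linear scheduler implies for the learning rate, the sketch below computes the rate at a given optimizer step. It assumes no warmup (the card lists no warmup setting) and a total of 9760 optimizer steps, the step count reported for the final epoch in the training results. `linear_lr` is a hypothetical helper written for this example, not a library function.

```python
def linear_lr(step, base_lr=5e-05, total_steps=9760, warmup_steps=0):
    """Learning rate under a linear decay schedule.

    Assumptions (not stated in the card): warmup_steps=0, and the rate
    decays linearly from base_lr to 0 over total_steps optimizer steps.
    """
    if step < warmup_steps:
        # Linear ramp-up from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Print the rate at the start, the midpoint, and the end of training.
    for step in (0, 4880, 9760):
        print(step, linear_lr(step))
```

Under these assumptions the rate starts at 5e-05, is halved by the end of epoch 10, and reaches zero at the last step.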

Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu | Gen Len |
|---|---|---|---|---|---|
| No log | 1.0 | 488 | 0.0488 | 70.4997 | 15.5281 |
| 0.1586 | 2.0 | 976 | 0.0438 | 70.9811 | 15.4925 |
| 0.0868 | 3.0 | 1464 | 0.0342 | 71.2514 | 15.5172 |
| 0.0736 | 4.0 | 1952 | 0.0283 | 71.4007 | 15.4953 |
| 0.0532 | 5.0 | 2440 | 0.0241 | 71.5523 | 15.503 |
| 0.043 | 6.0 | 2928 | 0.0172 | 71.7499 | 15.5107 |
| 0.0351 | 7.0 | 3416 | 0.0144 | 71.7684 | 15.4988 |
| 0.029 | 8.0 | 3904 | 0.0117 | 71.8266 | 15.5092 |
| 0.0219 | 9.0 | 4392 | 0.0091 | 71.9555 | 15.508 |
| 0.0187 | 10.0 | 4880 | 0.0080 | 71.9537 | 15.5095 |
| 0.0147 | 11.0 | 5368 | 0.0057 | 72.0262 | 15.5088 |
| 0.0117 | 12.0 | 5856 | 0.0036 | 72.08 | 15.5087 |
| 0.0107 | 13.0 | 6344 | 0.0029 | 72.1015 | 15.5084 |
| 0.0073 | 14.0 | 6832 | 0.0023 | 72.1016 | 15.5078 |
| 0.0052 | 15.0 | 7320 | 0.0017 | 72.1217 | 15.5099 |
| 0.0049 | 16.0 | 7808 | 0.0011 | 72.1201 | 15.5083 |
| 0.0029 | 17.0 | 8296 | 0.0002 | 72.1486 | 15.509 |
| 0.0018 | 18.0 | 8784 | 0.0003 | 72.1529 | 15.509 |
| 0.0016 | 19.0 | 9272 | 0.0001 | 72.1549 | 15.509 |
| 0.0012 | 20.0 | 9760 | 0.0001 | 72.1549 | 15.509 |
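
The step counts above, together with the train batch size of 16, allow a rough estimate of the training set size, which the card does not state. The sketch below is only that arithmetic. It assumes a single device, no gradient accumulation, and that the final partial batch of each epoch is kept; none of these are confirmed by the card. `dataset_size_bounds` is a hypothetical helper name.

```python
def dataset_size_bounds(steps_per_epoch, batch_size):
    """Range of example counts that yields `steps_per_epoch` batches.

    Assumes one optimizer step per batch (single device, no gradient
    accumulation) and that the last, possibly partial, batch is kept.
    """
    # Smallest count: all batches full except a final batch of one example.
    lower = (steps_per_epoch - 1) * batch_size + 1
    # Largest count: every batch is full.
    upper = steps_per_epoch * batch_size
    return lower, upper


if __name__ == "__main__":
    # 488 steps per epoch (first row of the results), batch size 16.
    print(dataset_size_bounds(488, 16))
```

If any of the assumptions do not hold, for example training on several devices, the actual training set would be proportionally larger.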

Framework versions

  • Transformers 4.34.0
  • Pytorch 2.0.1+cu118
  • Datasets 2.14.5
  • Tokenizers 0.14.1