
spell_corrector_mt5_01012024_v2_inbalanced_mistakes

This model is a fine-tuned version of Buseak/spell_corrector_mt5_01012024 on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4841
  • Bleu: 35.9177
  • Gen Len: 15.7215
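
The card does not include the evaluation code, so the exact metric implementation is not shown; the Bleu and Gen Len figures above are the kind produced by a compute_metrics hook during Seq2SeqTrainer evaluation. A minimal sketch of such a hook, assuming sacreBLEU via the evaluate library (an assumption, not confirmed by the card):

```python
import numpy as np
import evaluate
from transformers import AutoTokenizer

# sacreBLEU via `evaluate` is an assumption; the card does not show
# which BLEU implementation produced the numbers above.
sacrebleu = evaluate.load("sacrebleu")
tokenizer = AutoTokenizer.from_pretrained(
    "Buseak/spell_corrector_mt5_01012024_v2_inbalanced_mistakes"
)

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    # Labels use -100 as loss padding; swap it back to the pad token before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    bleu = sacrebleu.compute(
        predictions=decoded_preds,
        references=[[label] for label in decoded_labels],
    )["score"]
    # Gen Len: mean number of non-pad tokens in the generated predictions.
    gen_len = np.mean(
        [np.count_nonzero(pred != tokenizer.pad_token_id) for pred in preds]
    )
    return {"bleu": bleu, "gen_len": gen_len}
```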

Model description

More information needed

Intended uses & limitations

More information needed
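
Pending documentation, here is a minimal inference sketch assuming the standard Transformers seq2seq API. The input format (a raw misspelled sentence with no task prefix) and the language of the example sentence are assumptions; the card states neither.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "Buseak/spell_corrector_mt5_01012024_v2_inbalanced_mistakes"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Assumed input format: the misspelled sentence as-is, no task prefix.
text = "I beleive this sentense has some speling mistakes."
inputs = tokenizer(text, return_tensors="pt")
# Eval Gen Len is ~15.7 tokens, so 32 new tokens leaves comfortable headroom.
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```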

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 15
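
These settings map one-to-one onto Seq2SeqTrainingArguments; the sketch below reconstructs an equivalent configuration. output_dir and predict_with_generate are assumptions, since the actual training script is not part of the card.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="spell_corrector_mt5_v2",   # hypothetical; not stated on the card
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=15,
    predict_with_generate=True,  # assumed, so Bleu/Gen Len can be computed at eval time
)
```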

Training results

Training Loss   Epoch   Step    Validation Loss   Bleu      Gen Len
1.1495          1.0     976     0.6422            32.0204   15.7943
1.0698          2.0     1952    0.6209            32.5167   15.7969
0.9987          3.0     2928    0.5965            33.0715   15.7826
0.9755          4.0     3904    0.5770            33.4511   15.7809
0.949           5.0     4880    0.5583            34.2697   15.7524
0.9232          6.0     5856    0.5379            34.6321   15.7416
0.9036          7.0     6832    0.5254            34.9265   15.7377
0.8923          8.0     7808    0.5141            35.161    15.7364
0.8771          9.0     8784    0.5077            35.3906   15.7275
0.8675          10.0    9760    0.5017            35.5138   15.7251
0.8517          11.0    10736   0.4940            35.7429   15.7199
0.8623          12.0    11712   0.4915            35.7791   15.7264
0.8569          13.0    12688   0.4881            35.8203   15.7232
0.8544          14.0    13664   0.4860            35.9037   15.7224
0.8657          15.0    14640   0.4841            35.9177   15.7215

Framework versions

  • Transformers 4.35.2
  • Pytorch 2.1.0+cu121
  • Datasets 2.16.1
  • Tokenizers 0.15.0

Model details

  • Format: Safetensors
  • Model size: 300M params
  • Tensor type: F32

Model tree for Buseak/spell_corrector_mt5_01012024_v2_inbalanced_mistakes

  • Base model: google/mt5-small
  • Direct parent: Buseak/spell_corrector_mt5_01012024