iteboshi

This model was trained from scratch on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.1519
  • WER: 96.7751
  • CER: 49.6435

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 12
  • eval_batch_size: 12
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 48
  • optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • training_steps: 10000
  • mixed_precision_training: Native AMP
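A minimal sketch of how these hyperparameters relate, assuming the standard Trainer semantics (per-device batch size times gradient accumulation gives the total batch size, and the linear scheduler warms up for 500 steps then decays to zero at step 10,000). The function name is illustrative, not from the training code:

```python
learning_rate = 2e-05
train_batch_size = 12
gradient_accumulation_steps = 4
warmup_steps = 500
training_steps = 10000

# Effective batch size per optimizer update = per-device batch * accumulation.
total_train_batch_size = train_batch_size * gradient_accumulation_steps  # 48

def lr_at(step):
    """Linear warmup to the peak LR, then linear decay to 0 (assumed schedule)."""
    if step < warmup_steps:
        return learning_rate * step / warmup_steps
    return learning_rate * (training_steps - step) / (training_steps - warmup_steps)
```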

Training results

| Training Loss | Epoch   | Step  | Validation Loss | WER     | CER     |
|:-------------:|:-------:|:-----:|:---------------:|:-------:|:-------:|
| 1.8339        | 1.6507  | 1000  | 1.9115          | 99.6794 | 93.6205 |
| 0.9948        | 3.3006  | 2000  | 1.2763          | 97.3503 | 59.4213 |
| 0.7577        | 4.9513  | 3000  | 1.1085          | 96.6431 | 53.3468 |
| 0.5464        | 6.6012  | 4000  | 1.0575          | 95.4927 | 48.2507 |
| 0.4182        | 8.2510  | 5000  | 1.0574          | 96.2376 | 47.2929 |
| 0.3164        | 9.9017  | 6000  | 1.0616          | 96.3885 | 49.4417 |
| 0.2319        | 11.5516 | 7000  | 1.0929          | 96.2565 | 49.5535 |
| 0.1899        | 13.2015 | 8000  | 1.1223          | 97.2749 | 48.6737 |
| 0.1425        | 14.8522 | 9000  | 1.1422          | 96.6148 | 48.4484 |
| 0.1610        | 16.5021 | 10000 | 1.1519          | 96.7751 | 49.6435 |
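Although the dataset is not named in this card, the table lets us estimate its size: with a total batch size of 48, step and epoch counts imply the number of training examples per epoch. A rough back-of-envelope:

```python
total_train_batch_size = 48
steps, epochs = 1000, 1.6507  # first row of the results table

# Examples seen = steps * batch size; divide by epochs for the epoch size.
examples_per_epoch = steps * total_train_batch_size / epochs
# roughly 29,000 training examples
```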

Framework versions

  • Transformers 4.48.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1