whisper-small-ko

This model was trained on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3342
  • Cer: 9.3562
  • Wer: 23.5998

Model description

More information needed

Intended uses & limitations

More information needed
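
A minimal inference sketch, assuming the checkpoint follows the standard Whisper layout and loads through the transformers ASR pipeline; the repo id below is a placeholder for this model's actual Hugging Face path.

```python
# Minimal inference sketch (assumption: standard Whisper checkpoint layout).
# "your-namespace/whisper-small-ko" is a placeholder repo id.
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="your-namespace/whisper-small-ko",
)

# Transcribe a local Korean audio file (16 kHz mono audio works best).
print(asr("sample.wav")["text"])
```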

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 32
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • training_steps: 25000
  • mixed_precision_training: Native AMP
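
As a sketch, these hyperparameters map onto transformers' Seq2SeqTrainingArguments as shown below; the actual training script is not included in this card, so everything other than the listed values is illustrative.

```python
# Sketch of the hyperparameters above expressed as Seq2SeqTrainingArguments.
# Only the values come from the card; the surrounding setup is illustrative.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="whisper-small-ko",   # illustrative
    learning_rate=1e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=2,   # 32 * 2 = 64 total train batch size
    lr_scheduler_type="linear",
    warmup_steps=500,
    max_steps=25000,
    fp16=True,                       # "Native AMP" mixed precision
    # Adam betas=(0.9, 0.999) and epsilon=1e-08 are the optimizer defaults.
)
```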

Training results

| Training Loss | Validation Loss | Step | Epoch | Cer | Wer |
|:-------------:|:---------------:|:----:|:-----:|:---:|:---:|
| 0.8798 | 0.4962 | 500 | 0.2781 | 12.9471 | 32.0342 |
| 0.6216 | 0.4342 | 1000 | 0.5562 | 12.3301 | 30.0835 |
| 0.5545 | 0.4136 | 1500 | 0.8343 | 11.6154 | 28.6587 |
| 0.5038 | 0.3996 | 2000 | 1.1123 | 11.2510 | 27.8611 |
| 0.4938 | 0.3895 | 2500 | 1.3904 | 10.9614 | 27.1911 |
| 0.472 | 0.3840 | 3000 | 1.6685 | 10.7938 | 26.8655 |
| 0.46 | 0.3792 | 3500 | 1.9466 | 10.7587 | 26.7165 |
| 0.4475 | 0.3754 | 4000 | 2.2247 | 10.6994 | 26.5890 |
| 0.4396 | 0.3725 | 4500 | 2.5028 | 10.5686 | 26.3647 |
| 0.4335 | 0.3729 | 5000 | 2.7809 | 10.5336 | 26.2592 |
| 0.429 | 0.3726 | 5500 | 3.0590 | 10.4199 | 26.1158 |
| 0.424 | 0.3675 | 6000 | 3.3370 | 10.3781 | 25.9293 |
| 0.4191 | 0.3644 | 6500 | 3.6151 | 10.2905 | 25.7397 |
| 0.4115 | 0.3616 | 7000 | 3.8932 | 10.2256 | 25.5453 |
| 0.3951 | 0.3613 | 7500 | 4.1713 | 10.2008 | 25.4925 |
| 0.4018 | 0.3589 | 8000 | 4.4494 | 10.0662 | 25.2825 |
| 0.3957 | 0.3568 | 8500 | 4.7275 | 10.0739 | 25.2567 |
| 0.3944 | 0.3558 | 9000 | 5.0056 | 10.0189 | 25.1651 |
| 0.3879 | 0.3557 | 9500 | 5.2836 | 9.9843 | 25.1215 |
| 0.3849 | 0.3555 | 10000 | 5.5617 | 9.9827 | 25.0944 |
| 0.3875 | 0.3555 | 10500 | 5.8398 | 9.9046 | 25.0302 |
| 0.3787 | 0.3535 | 11000 | 6.1179 | 9.8685 | 24.8701 |
| 0.3751 | 0.3523 | 11500 | 6.3960 | 9.8697 | 24.8279 |
| 0.3765 | 0.3500 | 12000 | 6.6741 | 9.8013 | 24.6271 |
| 0.3718 | 0.3467 | 12500 | 6.9522 | 9.7820 | 24.5162 |
| 0.3574 | 0.3466 | 13000 | 7.2303 | 9.6958 | 24.4707 |
| 0.357 | 0.3439 | 13500 | 7.5083 | 9.6296 | 24.2872 |
| 0.3575 | 0.3420 | 14000 | 7.7864 | 9.6123 | 24.2847 |
| 0.3437 | 0.3414 | 14500 | 8.0645 | 9.5519 | 24.1108 |
| 0.3532 | 0.3410 | 15000 | 8.3426 | 9.4921 | 24.0301 |
| 0.345 | 0.3404 | 15500 | 8.6207 | 9.5022 | 23.9979 |
| 0.3456 | 0.3391 | 16000 | 8.8988 | 9.5080 | 23.9553 |
| 0.334 | 0.3394 | 16500 | 9.1769 | 9.4458 | 23.8505 |
| 0.3394 | 0.3378 | 17000 | 9.4549 | 9.4214 | 23.8336 |
| 0.3383 | 0.3371 | 17500 | 9.7330 | 9.4591 | 23.8312 |
| 0.3353 | 0.3370 | 18000 | 10.0111 | 9.3743 | 23.7401 |
| 0.3337 | 0.3366 | 18500 | 10.2892 | 9.3624 | 23.7240 |
| 0.336 | 0.3361 | 19000 | 10.5673 | 9.4365 | 23.7595 |
| 0.3287 | 0.3359 | 19500 | 10.8454 | 9.3547 | 23.6841 |
| 0.3266 | 0.3358 | 20000 | 11.1235 | 9.3734 | 23.6911 |
| 0.3316 | 0.3357 | 20500 | 11.4016 | | 23.6929 |
| 0.3314 | 0.3351 | 21000 | 11.6796 | | 23.6263 |
| 0.3315 | 0.3349 | 21500 | 11.9577 | | 23.5889 |
| 0.3268 | 0.3348 | 22000 | 12.2358 | | 23.5430 |
| 0.3263 | 0.3350 | 22500 | 12.5139 | | 23.5920 |
| 0.325 | 0.3342 | 23000 | 12.7920 | 9.3562 | 23.5998 |
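
For context on the Cer and Wer columns: a sketch of how such scores are commonly computed with the evaluate library, multiplied by 100 to match the scale above. This is an assumption; the card does not state which tooling produced the metrics.

```python
# Sketch of a typical CER/WER computation with the `evaluate` library
# (an assumption -- the card does not say how its metrics were produced).
import evaluate

cer = evaluate.load("cer")
wer = evaluate.load("wer")

refs = ["안녕하세요"]   # reference transcripts (illustrative)
hyps = ["안녕 하세요"]  # model outputs (illustrative)

# Multiply by 100 on the assumption that the card reports percentages.
print(100 * cer.compute(predictions=hyps, references=refs))
print(100 * wer.compute(predictions=hyps, references=refs))
```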

Framework versions

  • Transformers 4.42.0.dev0
  • Pytorch 2.3.1+cu121
  • Datasets 2.19.2
  • Tokenizers 0.19.1