metadata

library_name: transformers
language:
  - th
license: apache-2.0
base_model: openai/whisper-medium
tags:
  - asr
  - speech-recognition
  - thai
  - custom-model
  - fine-tuning
  - Common Voice
  - generated_from_trainer
metrics:
  - wer
model-index:
  - name: Whisper Medium TH - Custom datasets and Common voice 17
    results: []

Whisper Medium TH - Custom datasets and Common voice 17

This model is a fine-tuned version of openai/whisper-medium on the mozilla-foundation/common_voice_17_0 dataset. At the final checkpoint (step 3000), it achieves the following results on the evaluation set:

Validation loss: 0.1365
WER: 59.1432
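
The card does not say how the word error rate was computed. The sketch below shows the standard definition: word-level edit distance divided by the number of reference words. Two things here are assumptions rather than facts from this card: that the reported figure is this ratio multiplied by 100, as is common in Trainer-based fine-tuning scripts, and that words are obtained by splitting on whitespace. Thai is normally written without spaces between words, so the score depends heavily on how the text is segmented before scoring.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()  # assumption: whitespace-separated words
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# One substitution and one deletion against four reference words.
score = word_error_rate("a b c d", "a x c")
print(f"WER = {score:.2f} ({100 * score:.1f} on a 0-100 scale)")
```

Under this reading, a value of 59.1432 would mean roughly 59 word-level errors per 100 reference words.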

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 16
eval_batch_size: 16
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 500
training_steps: 3000
mixed_precision_training: Native AMP
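
As a rough sketch, the values above can be written as keyword arguments in the style of transformers' `Seq2SeqTrainingArguments`. The mapping from the card's labels to argument names is an assumption (for example, that `train_batch_size` is the per-device batch size and that "Native AMP" corresponds to `fp16=True`). The helper `linear_schedule_lr` is a hypothetical name; it illustrates how a linear schedule with 500 warmup steps and 3000 total steps would shape the learning rate.

```python
# Assumed mapping of the listed hyperparameters to Seq2SeqTrainingArguments
# keyword names; the original training script is not part of this card.
TRAINING_KWARGS = {
    "learning_rate": 1e-5,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 16,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "warmup_steps": 500,
    "max_steps": 3000,
    "fp16": True,  # assumption: "Native AMP" refers to fp16 mixed precision
}


def linear_schedule_lr(step, peak_lr=1e-5, warmup_steps=500, total_steps=3000):
    """Learning rate at `step`: linear warmup to the peak, then linear decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / (total_steps - warmup_steps)


for step in (0, 250, 500, 1750, 3000):
    print(f"step {step:>4}: lr = {linear_schedule_lr(step):.2e}")
```

With transformers installed, the dictionary could be passed as `Seq2SeqTrainingArguments(output_dir="out", **TRAINING_KWARGS)`, where `"out"` is a placeholder path. Constructing the arguments with `fp16=True` generally requires a supported accelerator such as a CUDA GPU.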

Training results

Training Loss	Epoch	Step	Validation Loss	WER
0.267	0.2405	500	0.2401	77.2512
0.1975	0.4810	1000	0.1865	68.8672
0.1935	0.7215	1500	0.1659	64.4477
0.1591	0.9620	2000	0.1477	63.3167
0.0821	1.2025	2500	0.1431	60.2557
0.0762	1.4430	3000	0.1365	59.1432
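
The table above allows a little arithmetic that helps in reading it. The sketch below estimates the number of optimizer steps per epoch from the Epoch and Step columns and computes the relative WER reduction between the first and last evaluation. The examples-per-epoch figure is only an estimate: it assumes a batch size of 16 on a single device with no gradient accumulation, none of which the card confirms.

```python
# Rows copied from the training results table:
# (training_loss, epoch, step, validation_loss, wer)
ROWS = [
    (0.2670, 0.2405, 500, 0.2401, 77.2512),
    (0.1975, 0.4810, 1000, 0.1865, 68.8672),
    (0.1935, 0.7215, 1500, 0.1659, 64.4477),
    (0.1591, 0.9620, 2000, 0.1477, 63.3167),
    (0.0821, 1.2025, 2500, 0.1431, 60.2557),
    (0.0762, 1.4430, 3000, 0.1365, 59.1432),
]


def steps_per_epoch(rows):
    """Average of step / epoch over all logged rows."""
    estimates = [step / epoch for _, epoch, step, _, _ in rows]
    return sum(estimates) / len(estimates)


def relative_wer_reduction(rows):
    """Fractional drop in WER from the first logged row to the last."""
    first, last = rows[0][4], rows[-1][4]
    return (first - last) / first


spe = steps_per_epoch(ROWS)
print(f"steps per epoch: about {spe:.0f}")
# Assumption: 16 examples per step (single device, no gradient accumulation).
print(f"examples per epoch: about {spe * 16:.0f}")
print(f"relative WER reduction, step 500 to 3000: {relative_wer_reduction(ROWS):.1%}")
```

The run stops at about 1.44 epochs, and both validation loss and WER are still falling at the last logged step.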

Framework versions

Transformers 4.45.2
PyTorch 2.5.1+cu121
Datasets 3.2.0
Tokenizers 0.20.3
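
To compare a local environment against the versions listed above, a small check along these lines may help. It is a sketch only: the PyPI distribution names (`torch` for PyTorch, for instance) and the helper names `installed_version` and `compare_versions` are assumptions, and an exact match is not necessarily required to load the model.

```python
from importlib import metadata

# Versions listed in this card, keyed by assumed PyPI distribution name.
EXPECTED = {
    "transformers": "4.45.2",
    "torch": "2.5.1+cu121",
    "datasets": "3.2.0",
    "tokenizers": "0.20.3",
}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def compare_versions(expected, get_version=installed_version):
    """Map each package to (expected, installed, exact_match)."""
    report = {}
    for name, wanted in expected.items():
        found = get_version(name)
        report[name] = (wanted, found, found == wanted)
    return report


for name, (wanted, found, ok) in compare_versions(EXPECTED).items():
    status = "match" if ok else "differs"
    print(f"{name}: expected {wanted}, installed {found} ({status})")
```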