sugafree/whisper-medium-hu

Automatic Speech Recognition

Generated from Trainer

Whisper Medium HU

This model is a fine-tuned version of openai/whisper-medium on the Common Voice 13 dataset. At the final training step (20000), it achieves the following results on the evaluation set:

Loss: 0.2699
Wer Ortho: 17.1763
Wer: 14.8290
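
The card reports two word error rates but does not define them. In common Whisper fine-tuning recipes, "Wer Ortho" is computed on the raw (orthographic) text and "Wer" on normalized text; that reading is an assumption here, and the normalizer below (`simple_normalize`) is a hypothetical stand-in, not the one used for this model. A minimal sketch of how the two numbers can differ:

```python
import re


def word_error_rate(references, hypotheses):
    """Corpus-level WER in percent: total word edits / total reference words."""
    total_edits = 0
    total_words = 0
    for ref, hyp in zip(references, hypotheses):
        ref_words, hyp_words = ref.split(), hyp.split()
        # Word-level Levenshtein distance using a rolling row.
        prev = list(range(len(hyp_words) + 1))
        for i, ref_word in enumerate(ref_words, start=1):
            curr = [i]
            for j, hyp_word in enumerate(hyp_words, start=1):
                cost = 0 if ref_word == hyp_word else 1
                curr.append(min(prev[j] + 1,          # deletion
                                curr[j - 1] + 1,      # insertion
                                prev[j - 1] + cost))  # substitution or match
            prev = curr
        total_edits += prev[-1]
        total_words += len(ref_words)
    if total_words == 0:
        raise ValueError("references contain no words")
    return 100.0 * total_edits / total_words


def simple_normalize(text):
    """Hypothetical normalizer: lowercase, drop punctuation, collapse spaces."""
    text = re.sub(r"[^\w\s]", "", text.lower())
    return " ".join(text.split())


refs = ["Jó reggelt, világ!"]
hyps = ["Jó reggel világ"]

wer_ortho = word_error_rate(refs, hyps)
wer_norm = word_error_rate([simple_normalize(r) for r in refs],
                           [simple_normalize(h) for h in hyps])
print(round(wer_ortho, 2))  # 66.67
print(round(wer_norm, 2))   # 33.33
```

Punctuation and casing mismatches count as errors only in the orthographic score, which is consistent with "Wer Ortho" being the higher of the two figures above.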

Model description

More information needed

Intended uses & limitations

More information needed
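
The card does not document intended uses. As a minimal sketch of one plausible use, the checkpoint can be loaded with the Transformers `pipeline` API for transcription. The language setting assumes that "HU" denotes Hungarian, and `transcribe` is a hypothetical helper name; neither is confirmed by the card.

```python
import sys

MODEL_ID = "sugafree/whisper-medium-hu"


def transcribe(audio_path, language="hungarian"):
    """Transcribe one audio file (hypothetical helper, not from the card)."""
    # Imported lazily so the module can be loaded without transformers installed.
    from transformers import pipeline

    asr = pipeline(
        "automatic-speech-recognition",
        model=MODEL_ID,
        chunk_length_s=30,  # Whisper operates on 30-second windows
    )
    # Assumption: Hungarian speech; decoding a file path also requires ffmpeg.
    result = asr(audio_path,
                 generate_kwargs={"language": language, "task": "transcribe"})
    return result["text"]


if __name__ == "__main__" and len(sys.argv) > 1:
    print(transcribe(sys.argv[1]))
```

Running the script with a path to an audio file downloads the weights on first use and prints the transcription.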

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 16
eval_batch_size: 16
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: constant_with_warmup
lr_scheduler_warmup_steps: 50
training_steps: 20000
mixed_precision_training: Native AMP
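
With `constant_with_warmup`, the learning rate ramps linearly from zero over the warmup steps and then stays fixed. The sketch below reproduces that shape in plain Python using the values listed above; `constant_with_warmup_lr` is an illustrative name, not a library function.

```python
def constant_with_warmup_lr(step, base_lr=1e-5, warmup_steps=50):
    """Learning rate at a given optimizer step: linear warmup, then constant."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    return base_lr


TRAINING_STEPS = 20000
WARMUP_STEPS = 50

for step in (0, 25, 50, TRAINING_STEPS):
    print(step, constant_with_warmup_lr(step))

# Share of training spent in warmup.
warmup_fraction = WARMUP_STEPS / TRAINING_STEPS
print(warmup_fraction)  # 0.0025
```

Warmup covers only 0.25% of the run, so almost all of the 20000 steps use the full learning rate of 1e-05 with no decay.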

Training results

Training Loss	Epoch	Step	Validation Loss	Wer Ortho	Wer
0.0804	1.38	2000	0.1977	19.2869	16.6612
0.038	2.76	4000	0.2028	18.2211	15.7494
0.014	4.14	6000	0.2190	17.9961	15.3466
0.0107	5.51	8000	0.2328	17.3490	14.9370
0.0144	6.89	10000	0.2376	17.4153	14.9559
0.0049	8.27	12000	0.2424	16.9984	14.6953
0.0071	9.65	14000	0.2594	17.6961	15.3586
0.0037	11.03	16000	0.2546	17.2007	14.8667
0.0078	12.41	18000	0.2644	17.5757	15.1495
0.0043	13.78	20000	0.2699	17.1763	14.8290
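
The headline metrics come from the last row, but the table shows that the final step is not the best one on every measure. A small sketch that scans the rows above (copied verbatim) for the lowest WER and the lowest validation loss:

```python
from collections import namedtuple

Row = namedtuple("Row", "train_loss epoch step val_loss wer_ortho wer")

rows = [
    Row(0.0804, 1.38, 2000, 0.1977, 19.2869, 16.6612),
    Row(0.038, 2.76, 4000, 0.2028, 18.2211, 15.7494),
    Row(0.014, 4.14, 6000, 0.2190, 17.9961, 15.3466),
    Row(0.0107, 5.51, 8000, 0.2328, 17.3490, 14.9370),
    Row(0.0144, 6.89, 10000, 0.2376, 17.4153, 14.9559),
    Row(0.0049, 8.27, 12000, 0.2424, 16.9984, 14.6953),
    Row(0.0071, 9.65, 14000, 0.2594, 17.6961, 15.3586),
    Row(0.0037, 11.03, 16000, 0.2546, 17.2007, 14.8667),
    Row(0.0078, 12.41, 18000, 0.2644, 17.5757, 15.1495),
    Row(0.0043, 13.78, 20000, 0.2699, 17.1763, 14.8290),
]

best_wer = min(rows, key=lambda r: r.wer)
best_val_loss = min(rows, key=lambda r: r.val_loss)

print(best_wer.step, best_wer.wer)                 # 12000 14.6953
print(best_val_loss.step, best_val_loss.val_loss)  # 2000 0.1977
```

The lowest WER (14.6953) is at step 12000, while validation loss is lowest at step 2000 and trends upward afterwards as training loss keeps falling. The card does not say whether intermediate checkpoints were kept, so the published weights may correspond to step 20000 rather than the best-WER step.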

Framework versions

Transformers 4.37.2
Pytorch 2.2.0
Datasets 2.17.0
Tokenizers 0.15.2

Model size: 764M params (Safetensors, tensor type F32)
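
The parameter count and tensor type give a rough lower bound on the memory needed just to hold the weights. A back-of-the-envelope sketch, treating "764M" as 764,000,000 (a rounded figure) and ignoring activations and decoding caches; the half-precision line is a hypothetical comparison, not something the card reports:

```python
def weight_memory_bytes(num_params, bytes_per_param):
    """Bytes needed to store the weights alone."""
    return num_params * bytes_per_param


NUM_PARAMS = 764_000_000  # "764M params", rounded as shown on the card

f32_bytes = weight_memory_bytes(NUM_PARAMS, 4)  # F32: 4 bytes per parameter
f16_bytes = weight_memory_bytes(NUM_PARAMS, 2)  # hypothetical half-precision cast

print(f"{f32_bytes / 1e9:.2f} GB")  # 3.06 GB
print(f"{f16_bytes / 1e9:.2f} GB")  # 1.53 GB
```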

Evaluation results

Wer on Common Voice 13 (test set, self-reported): 14.829
