metadata

datasets:
  - mozilla-foundation/common_voice_16_0
language:
  - hu
widget:
  - example_title: Sample 1
    src: >-
      https://huggingface.co/datasets/Hungarians/samples/resolve/main/Sample1.flac
  - example_title: Sample 2
    src: >-
      https://huggingface.co/datasets/Hungarians/samples/resolve/main/Sample2.flac
license: apache-2.0
base_model: openai/whisper-base
tags:
  - generated_from_trainer
metrics:
  - wer
model-index:
  - name: Whisper Base Hu v2
    results: []

Whisper Base Hu v2

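This model is a fine-tuned version of openai/whisper-base for Hungarian (hu), trained on the Common Voice 16.0 dataset (mozilla-foundation/common_voice_16_0). At the final training step (15000), it achieves the following results on the evaluation set: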

Loss: 0.1599
Wer Ortho: 12.6641
Wer: 11.4171
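
This card does not define the two WER figures. The sketch below assumes that "Wer Ortho" is scored on the raw transcript (casing and punctuation kept), that "Wer" is scored after text normalization, and that both are reported as percentages. The `normalize` helper is a hypothetical stand-in, not the normalizer used for this model.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


def normalize(text: str) -> str:
    """Hypothetical normalizer (assumption): lowercase and drop punctuation."""
    return "".join(ch for ch in text.lower() if ch.isalnum() or ch.isspace())


reference = "Jó reggelt, világ!"
hypothesis = "jó reggelt világ"

wer_ortho = word_error_rate(reference, hypothesis)
wer_normalized = word_error_rate(normalize(reference), normalize(hypothesis))
print(f"orthographic WER: {100 * wer_ortho:.2f}")
print(f"normalized WER:   {100 * wer_normalized:.2f}")
```

In this toy pair every word differs only in casing or punctuation, so the orthographic score counts three substitutions while the normalized score counts none. Under the assumption above, that would explain why "Wer Ortho" is consistently higher than "Wer" in this card.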

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2.75e-05
train_batch_size: 16
eval_batch_size: 16
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: constant_with_warmup
lr_scheduler_warmup_steps: 500
training_steps: 15000
mixed_precision_training: Native AMP
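
As a rough illustration of what `constant_with_warmup` implies for the values above, the sketch below assumes the learning rate rises linearly from zero over the 500 warmup steps and then stays at 2.75e-05 until step 15000. The function name `lr_at_step` is hypothetical and the code is not taken from the training script.

```python
LEARNING_RATE = 2.75e-05  # learning_rate from the list above
WARMUP_STEPS = 500        # lr_scheduler_warmup_steps
TRAINING_STEPS = 15000    # training_steps


def lr_at_step(step: int,
               base_lr: float = LEARNING_RATE,
               warmup_steps: int = WARMUP_STEPS) -> float:
    """Learning rate at a given optimizer step (assumed linear warmup from 0)."""
    if step < warmup_steps:
        # Linear ramp: 0 at step 0, approaching base_lr at the end of warmup.
        return base_lr * step / max(1, warmup_steps)
    # After warmup the rate is held constant; there is no decay phase.
    return base_lr


for step in (0, 250, 500, TRAINING_STEPS):
    print(f"step {step:>5}: lr = {lr_at_step(step):.3e}")
```

With these settings, warmup covers only the first 500 of 15000 steps, so almost all of training runs at the full learning rate.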

Training results

Training Loss	Epoch	Step	Validation Loss	Wer Ortho	Wer
0.199	0.33	1000	0.3838	36.7548	33.5517
0.3037	0.67	2000	0.3131	31.2748	28.3664
0.221	1.0	3000	0.2546	27.1739	24.1773
0.1562	1.34	4000	0.2319	23.9341	21.3341
0.1623	1.67	5000	0.2101	21.4079	18.9623
0.077	2.01	6000	0.1818	18.5415	16.2852
0.078	2.34	7000	0.1846	17.8339	15.7456
0.0818	2.68	8000	0.1712	16.4669	14.5983
0.0352	3.01	9000	0.1669	15.6178	14.0676
0.0413	3.35	10000	0.1673	14.9464	13.4539
0.0454	3.68	11000	0.1649	14.5459	12.7542
0.0225	4.02	12000	0.1589	13.5885	12.2087
0.0269	4.35	13000	0.1638	14.3864	12.8343
0.0299	4.69	14000	0.1621	13.0555	11.7610
0.0171	5.02	15000	0.1599	12.6641	11.4171
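
The table can be summarized with a few lines of arithmetic. The sketch below copies the rows as (epoch, step, validation loss, Wer Ortho, Wer) and picks out the best checkpoint under each criterion. The steps-per-epoch and training-set-size estimates assume one optimizer step per batch of 16 on a single device (no gradient accumulation), which this card does not state.

```python
TRAIN_BATCH_SIZE = 16  # train_batch_size from the hyperparameters

# (epoch, step, validation_loss, wer_ortho, wer), copied from the table
rows = [
    (0.33, 1000, 0.3838, 36.7548, 33.5517),
    (0.67, 2000, 0.3131, 31.2748, 28.3664),
    (1.0, 3000, 0.2546, 27.1739, 24.1773),
    (1.34, 4000, 0.2319, 23.9341, 21.3341),
    (1.67, 5000, 0.2101, 21.4079, 18.9623),
    (2.01, 6000, 0.1818, 18.5415, 16.2852),
    (2.34, 7000, 0.1846, 17.8339, 15.7456),
    (2.68, 8000, 0.1712, 16.4669, 14.5983),
    (3.01, 9000, 0.1669, 15.6178, 14.0676),
    (3.35, 10000, 0.1673, 14.9464, 13.4539),
    (3.68, 11000, 0.1649, 14.5459, 12.7542),
    (4.02, 12000, 0.1589, 13.5885, 12.2087),
    (4.35, 13000, 0.1638, 14.3864, 12.8343),
    (4.69, 14000, 0.1621, 13.0555, 11.7610),
    (5.02, 15000, 0.1599, 12.6641, 11.4171),
]

# Best checkpoints by validation loss and by normalized WER.
best_loss_row = min(rows, key=lambda r: r[2])
best_wer_row = min(rows, key=lambda r: r[4])

# Estimate steps per epoch from the last row (assumption: no gradient accumulation).
last_epoch, last_step = rows[-1][0], rows[-1][1]
steps_per_epoch = last_step / last_epoch
approx_train_examples = steps_per_epoch * TRAIN_BATCH_SIZE

# Relative WER reduction between the first and last evaluation.
relative_reduction = (rows[0][4] - rows[-1][4]) / rows[0][4]

print("lowest validation loss at step", best_loss_row[1])
print("lowest Wer at step", best_wer_row[1])
print(f"approx. steps per epoch: {steps_per_epoch:.0f}")
print(f"approx. training examples per epoch: {approx_train_examples:.0f}")
print(f"relative Wer reduction, step 1000 to 15000: {relative_reduction:.1%}")
```

Validation loss bottoms out at step 12000 while both WER figures keep improving through step 15000, so the two criteria would select different checkpoints.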

Framework versions

Transformers 4.36.2
PyTorch 2.1.0+cu121
Datasets 2.16.1
Tokenizers 0.15.0