---
license: apache-2.0
base_model: openai/whisper-small
tags:
- generated_from_trainer
datasets:
- mozilla-foundation/common_voice_13_0
language:
- vi
metrics:
- wer
pipeline_tag: automatic-speech-recognition
---


# whisper-small-vi

This model is a fine-tuned version of [openai/whisper-small](https://huggingface.co/openai/whisper-small) on the [mozilla-foundation/common_voice_13_0](https://huggingface.co/datasets/mozilla-foundation/common_voice_13_0) dataset.
It achieves the following results on the test set:
- Loss: 0.5336
- WER: 0.2220 (a 36% relative reduction from the pretrained model's WER of 0.3447)
- WER_ortho: 0.2696 (a 38% relative reduction from the pretrained model's WER_ortho of 0.4324)
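
The percentages in the results list are relative changes in error rate, not absolute differences. The short sketch below reproduces that arithmetic from the reported numbers; the helper name `relative_reduction` is introduced here for illustration and is not part of the training code.

```python
def relative_reduction(baseline: float, finetuned: float) -> float:
    """Return the fractional drop of an error metric relative to its baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - finetuned) / baseline


# (pretrained, fine-tuned) values copied from the results list in this card.
reported = {
    "WER": (0.3447, 0.2220),
    "WER_ortho": (0.4324, 0.2696),
}

for name, (pretrained, finetuned) in reported.items():
    change = relative_reduction(pretrained, finetuned)
    print(f"{name}: {change:.1%} relative reduction")
```

Rounded to whole percentages, the two values come out at roughly 36% and 38%, matching the figures quoted in the list.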

## Model description

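A Vietnamese (`vi`) automatic speech recognition model, obtained by fine-tuning [openai/whisper-small](https://huggingface.co/openai/whisper-small) on Common Voice 13.0.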

## Intended uses & limitations

More information needed
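
As a starting point for Vietnamese transcription, the checkpoint can probably be loaded through the Transformers `pipeline` API. This is a minimal sketch under stated assumptions: `MODEL_ID` is a hypothetical repository id (this card does not say where the checkpoint is published), the audio path in the usage comment is a placeholder, and decoding an audio file from a path assumes `ffmpeg` is installed.

```python
# Hypothetical repository id: replace with the actual location of the checkpoint.
MODEL_ID = "your-username/whisper-small-vi"


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Transcribe one audio file with the fine-tuned checkpoint."""
    # Imported lazily so that defining this helper does not require transformers.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    # chunk_length_s lets the pipeline split clips longer than Whisper's 30 s window.
    result = asr(audio_path, chunk_length_s=30)
    return result["text"]


# Example call (placeholder file name):
# print(transcribe("sample_vi.wav"))
```

The first call downloads the model weights, so it needs network access and a valid repository id.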

## Training and evaluation data

More information needed


### Framework versions

- Transformers 4.32.1
- PyTorch 2.1.2
- Datasets 2.16.1
- Tokenizers 0.13.2