metadata

title: Ukrainian Speech-to-Text
emoji: 🐌
colorFrom: blue
colorTo: yellow
sdk: gradio
app_file: app.py
pinned: false

🇺🇦🎤 Voice recognition for the Ukrainian language

This repository aims to apply various speech recognition models to the Ukrainian language.

An online demo is available at https://huggingface.co/spaces/robinhad/ukrainian-stt.
The source code lives in this repository, together with the auto-deploy pipeline scripts.

🧮 Models

| Model name | CER | WER | License | Note |
|---|---|---|---|---|
| Wav2Vec2 | 6.01% | 27.99% | MIT | Common Voice 8 dataset, `test` set used as validation |
| DeepSpeech with Wiki LM | 12% | 30.65% | CC-BY-NC 4.0 | Common Voice 6 dataset |
| DeepSpeech | 16% | 57% | CC-BY-NC 4.0 | Common Voice 6 dataset |
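
CER (character error rate) and WER (word error rate) are both edit-distance metrics: the number of substitutions, insertions, and deletions needed to turn the model output into the reference transcript, divided by the reference length in characters or words. The following is a minimal sketch of that standard definition, not the evaluation script used for the numbers in the table. The function names `edit_distance`, `cer`, and `wer` are hypothetical, and the text normalization applied in this repository (casing, punctuation, whitespace) is not specified here, so the sketch applies none.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion from the reference
                curr[j - 1] + 1,     # insertion into the reference
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance over reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance over reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    reference = "привіт світ"
    hypothesis = "привіт свт"  # one character missing in the second word
    print(f"WER: {wer(reference, hypothesis):.2%}")
    print(f"CER: {cer(reference, hypothesis):.2%}")
```

In this example one of two words is wrong, while only one of eleven reference characters (spaces included) is missing. That gap is why WER in the table is much higher than CER: a single wrong character makes the whole word count as an error.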

If you'd like to check out other models for the Ukrainian language, please visit https://github.com/egorsmkv/speech-recognition-uk.

Training guides are available in the corresponding folder for each model.

@robinhad - model training.
@egorsmkv - organized the Ukrainian speech recognition community.
@tarasfrompir - created a synthetic 1200-hour Ukrainian Speech-to-Text dataset.
@AlexeyBoiler - hosted the Ukrainian Speech-to-Text dataset.