facebook/wav2vec2-xls-r-300m fine-tuned for Yoruba speech recognition on openslr (SLR86) and mozilla-foundation/common_voice_12_0.

WER (word error rate): 0.51
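
For readers unfamiliar with the metric, here is a minimal sketch of how a word error rate is typically computed: the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. This is an illustration only; the function name `word_error_rate` and the sample strings are hypothetical, and the card does not state which evaluation set or text normalization produced the 0.51 figure.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one substitution and one deletion over 4 reference words.
score = word_error_rate("a b c d", "a x c")
```

Under this definition, a value of 0.51 would mean roughly one word-level error for every two reference words.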

from huggingsound import SpeechRecognitionModel

model = SpeechRecognitionModel("AstralZander/yoruba_ASR")
audio_paths = ["/path/to/audio.wav"]  # Placeholder: replace with paths to your own audio files
transcriptions = model.transcribe(audio_paths)

transcriptions  # List with one entry per file: transcription, timestamps and probabilities
ind_audio = 0  # Index of the file of interest in audio_paths
transcriptions[ind_audio]["transcription"]  # Transcription of audio_paths[ind_audio]
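
The timestamps and probabilities mentioned above can be condensed into a short summary per file. The sketch below assumes each result is a dict with the keys `transcription`, `start_timestamps`, `end_timestamps` and `probabilities`, as shown in the huggingsound README; verify the key names against the installed version. The helper `summarize` and the sample dict are hypothetical and use made-up values, not real model output.

```python
from statistics import mean


def summarize(result: dict) -> dict:
    """Condense one transcription result into text, mean probability and time span."""
    # Assumed keys (from the huggingsound README); missing keys fall back to empty lists.
    probs = result.get("probabilities") or []
    starts = result.get("start_timestamps") or []
    ends = result.get("end_timestamps") or []
    return {
        "text": result["transcription"],
        "mean_probability": mean(probs) if probs else None,
        "span": (starts[0], ends[-1]) if starts and ends else None,
    }


# Made-up result with the assumed structure, for illustration only.
example = {
    "transcription": "ab",
    "start_timestamps": [100, 120],
    "end_timestamps": [120, 140],
    "probabilities": [0.9, 0.7],
}
summary = summarize(example)
```

With real output, `summarize(transcriptions[0])` would be applied to each entry returned by `model.transcribe`.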