whisper-large-v3-ft-cv-cy-en

This model is a fine-tuned version of openai/whisper-large-v3 on the techiaith/commonvoice_18_0_cy_en dataset. Both the English and the Welsh data were used for fine-tuning, so that the model transcribes both languages and detects the spoken language more reliably.

It achieves a success rate of 98.86% for language detection on recordings from a Common Voice bilingual test set.

On the same test set, it achieves the following word error rate (WER) results for transcription:

  • Welsh: 26.20
  • English: 15.37
  • Average: 20.70

N.B. The desired transcript language is not given to the fine-tuned model during testing, so the model has to identify the language of each recording itself.
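For readers unfamiliar with the metric, the following is a minimal sketch of how a word error rate can be computed: the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. It assumes the figures above are percentages and applies only lowercasing and whitespace splitting. The model card does not state which text normalisation or tooling was used for the reported results, so the function name `word_error_rate` and the misspelt example hypothesis are purely illustrative.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a percentage: word-level edit distance / reference length."""
    # Assumption: simple normalisation (lowercase + whitespace split) only.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing from output)
                curr[j - 1] + 1,     # insertion (extra word in output)
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# Hypothetical example: one misspelt word out of nine reference words.
reference = "Mae hen wlad fy nhadau yn annwyl i mi."
hypothesis = "Mae hen wlad fy nhadau yn anwyl i mi."
print(f"{word_error_rate(reference, hypothesis):.2f}")  # prints 11.11
```

In practice an established package such as jiwer or the Hugging Face evaluate library would normally be used instead of a hand-written function. Scores also depend heavily on how punctuation and casing are normalised before comparison.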

Usage

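from transformers import pipeline

# Load the fine-tuned model as an automatic speech recognition pipeline
transcriber = pipeline("automatic-speech-recognition", model="techiaith/whisper-large-v3-ft-cv-cy-en")

# Replace the placeholder with a local path or URL to your own sound file
audio = "<path or url to soundfile>"
result = transcriber(audio)
print(result)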

{'text': 'Mae hen wlad fy nhadau yn annwyl i mi.'}

Model size: 1.54B params (Safetensors), tensor type F32