
whisper-small-hi-cv

This model is a fine-tuned version of openai/whisper-small on the Common Voice 15 dataset. It achieves the following results on the evaluation set:

  • WER: 14.0178
  • CER: 5.8824
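
For readers unfamiliar with the two metrics, the following is a minimal pure-Python sketch of how word error rate (WER) and character error rate (CER) are commonly defined: the total edit distance between references and predictions, divided by the total reference length, expressed as a percentage. The helper names `edit_distance` and `error_rate` and the English sample sentences are illustrative assumptions, not part of this model's evaluation pipeline; metric libraries may differ in details such as whitespace handling.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference unit
                curr[j - 1] + 1,     # insertion of a predicted unit
                prev[j - 1] + cost,  # substitution (or match when cost is 0)
            ))
        prev = curr
    return prev[-1]


def error_rate(references, predictions, unit="word"):
    """Corpus-level error rate in percent: total edits / total reference units."""
    split = (lambda s: s.split()) if unit == "word" else list
    edits = sum(edit_distance(split(r), split(p))
                for r, p in zip(references, predictions))
    total = sum(len(split(r)) for r in references)
    return 100.0 * edits / total


# Hypothetical sample: one substituted word ("sat" -> "sit") and one dropped word ("the")
refs = ["the cat sat on the mat"]
hyps = ["the cat sit on mat"]
wer = error_rate(refs, hyps, unit="word")  # 2 edits over 6 reference words
cer = error_rate(refs, hyps, unit="char")  # 5 edits over 22 reference characters
print("WER: {:.2f}".format(wer))
print("CER: {:.2f}".format(cer))
```

Lower is better for both metrics. CER is usually smaller than WER on the same output because a single wrong character makes the whole word count as a word error.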

Evaluation

from datasets import load_dataset, Audio
from evaluate import load
from transformers import WhisperForConditionalGeneration, WhisperProcessor
import torch

# Hindi test split of Common Voice 13
test_dataset = load_dataset("mozilla-foundation/common_voice_13_0", "hi", split="test")

# WER and CER metrics from the evaluate library (both require the jiwer package)
wer = load("wer")
cer = load("cer")

processor = WhisperProcessor.from_pretrained("SakshiRathi77/whisper-hindi-kagglex")
model = WhisperForConditionalGeneration.from_pretrained("SakshiRathi77/whisper-hindi-kagglex").to("cuda")

# Resample the audio to the 16 kHz rate that Whisper expects
test_dataset = test_dataset.cast_column("audio", Audio(sampling_rate=16000))

def map_to_pred(batch):
    # Convert the raw waveform to log-Mel input features
    audio = batch["audio"]
    input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
    # Normalize the reference text the same way as the prediction
    batch["reference"] = processor.tokenizer._normalize(batch["sentence"])

    # Generate token ids without tracking gradients, then decode to text
    with torch.no_grad():
        predicted_ids = model.generate(input_features.to("cuda"))[0]
    transcription = processor.decode(predicted_ids)
    batch["prediction"] = processor.tokenizer._normalize(transcription)
    return batch

result = test_dataset.map(map_to_pred)

# Report both metrics as percentages with four decimal places
print("WER: {:.4f}".format(100 * wer.compute(predictions=result["prediction"], references=result["reference"])))
print("CER: {:.4f}".format(100 * cer.compute(predictions=result["prediction"], references=result["reference"])))
WER: 23.1361
CER: 10.4366