xlsr300m_cv_8.0_nl

Evaluation Commands

To evaluate on the test split of mozilla-foundation/common_voice_8_0:

python eval.py --model_id Iskaj/xlsr300m_cv_8.0_nl --dataset mozilla-foundation/common_voice_8_0 --config nl --split test

To evaluate on the validation split of speech-recognition-community-v2/dev_data, using chunked inference (5.0 s chunks, 1.0 s stride):

python eval.py --model_id Iskaj/xlsr300m_cv_8.0_nl --dataset speech-recognition-community-v2/dev_data --config nl --split validation --chunk_length_s 5.0 --stride_length_s 1.0
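The --chunk_length_s and --stride_length_s flags above control how long recordings are cut into overlapping windows before transcription. The sketch below only illustrates that idea; it assumes eval.py forwards both values to a chunking ASR pipeline and that the stride applies to both sides of each chunk. The function name chunk_windows and its defaults are hypothetical and are not part of eval.py or any library.

```python
def chunk_windows(num_samples, sampling_rate=16_000,
                  chunk_length_s=5.0, stride_length_s=1.0):
    """Return (start, end) sample indices of overlapping chunks.

    Illustrative only: each chunk is chunk_length_s long, and consecutive
    chunks overlap by 2 * stride_length_s (assumed: one stride per side).
    """
    chunk_len = int(round(chunk_length_s * sampling_rate))
    stride = int(round(stride_length_s * sampling_rate))
    step = chunk_len - 2 * stride  # how far the window advances each time
    if step <= 0:
        raise ValueError("chunk_length_s must exceed 2 * stride_length_s")

    windows = []
    start = 0
    while True:
        end = min(start + chunk_len, num_samples)
        windows.append((start, end))
        if end >= num_samples:  # the last window reached the end of the audio
            break
        start += step
    return windows


# A 12-second clip at 16 kHz with the settings from the command above.
for start, end in chunk_windows(12 * 16_000):
    print(f"{start / 16_000:.1f}s -> {end / 16_000:.1f}s")
```

With a 5.0 s chunk and a 1.0 s stride, each window advances by 3.0 s, so neighbouring chunks share 2.0 s of audio. That shared context is what lets the predictions near chunk edges be discarded when the pieces are stitched back together.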

Inference

import torch
import torchaudio.functional as F
from datasets import load_dataset
from transformers import AutoModelForCTC, AutoProcessor

model_id = "Iskaj/xlsr300m_cv_8.0_nl"

# Stream a single sample from the Dutch test split (requires an authenticated Hub login).
sample_iter = iter(load_dataset("mozilla-foundation/common_voice_8_0", "nl", split="test", streaming=True, use_auth_token=True))
sample = next(sample_iter)

# Resample from the clip's native rate (48 kHz for this dataset) to the 16 kHz the model expects.
resampled_audio = F.resample(
    torch.tensor(sample["audio"]["array"]),
    sample["audio"]["sampling_rate"],
    16_000,
).numpy()

model = AutoModelForCTC.from_pretrained(model_id)
processor = AutoProcessor.from_pretrained(model_id)

inputs = processor(resampled_audio, sampling_rate=16_000, return_tensors="pt")

# Forward pass without gradient tracking.
with torch.no_grad():
    logits = model(**inputs).logits

# Greedy CTC decoding: take the most likely token per frame, then decode to text.
predicted_ids = torch.argmax(logits, dim=-1)
transcription = processor.batch_decode(predicted_ids)

print(transcription[0].lower())
# 'het kontine schip lag aangemeert in de aven'
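
The argmax followed by batch_decode in the snippet above is greedy CTC decoding: pick the best token for each frame, merge consecutive repeats, then drop the blank token. The sketch below shows that collapse step on a toy example. The vocabulary, the blank_id value, and the function name ctc_greedy_collapse are all hypothetical; they are not the model's real tokenizer, which also handles word delimiters and special tokens.

```python
from itertools import groupby


def ctc_greedy_collapse(frame_ids, vocab, blank_id=0):
    """Collapse per-frame token ids into text using the CTC rule.

    1. Merge runs of identical consecutive ids.
    2. Remove the blank id.
    """
    merged = [token_id for token_id, _ in groupby(frame_ids)]
    return "".join(vocab[token_id] for token_id in merged if token_id != blank_id)


# Toy vocabulary (assumption for illustration; id 0 plays the role of the blank).
toy_vocab = {0: "<blank>", 1: "h", 2: "e", 3: "t"}

# Ten frames of per-frame argmax ids.
frame_ids = [1, 1, 0, 2, 2, 2, 0, 0, 3, 3]
print(ctc_greedy_collapse(frame_ids, toy_vocab))  # het
```

A blank between two identical ids is what allows a genuine double letter to survive: [1, 0, 1] collapses to "hh", while [1, 1] collapses to "h".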
