Edit model card
YAML Metadata Error: "datasets[0]" with value "https://arabicspeech.org/" is not valid. If possible, use a dataset id from https://hf.co/datasets.
YAML Metadata Error: "model-index[0].results[0].dataset.type" with value "arabicspeech.org MGB-3" fails to match the required pattern: /^(?:[\w-]+\/)?[\w-.]+$/

Test Wav2Vec2 with egyptian arabic

Fine-tuned facebook/wav2vec2-large-xlsr-53 in Egyptian using the arabicspeech.org MGB-3 When using this model, make sure that your speech input is sampled at 16kHz.

Usage

The model can be used directly (without a language model) as follows:

import torch
import torchaudio
from datasets import load_dataset
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor
dataset = load_dataset("arabic_speech_corpus", split="test")
processor = Wav2Vec2Processor.from_pretrained("othrif/wav2vec_test")
model = Wav2Vec2ForCTC.from_pretrained("othrif/wav2vec_test")
resampler = torchaudio.transforms.Resample(48_000, 16_000)
# Preprocessing the datasets.
# We need to read the aduio files as arrays
def speech_file_to_array_fn(batch):
\\tspeech_array, sampling_rate = torchaudio.load(batch["path"])
\\tbatch["speech"] = resampler(speech_array).squeeze().numpy()
\\treturn batch
test_dataset = test_dataset.map(speech_file_to_array_fn)
inputs = processor(test_dataset["speech"][:2], sampling_rate=16_000, return_tensors="pt", padding=True)
with torch.no_grad():
\\tlogits = model(inputs.input_values, attention_mask=inputs.attention_mask).logits
predicted_ids = torch.argmax(logits, dim=-1)
print("Prediction:", processor.batch_decode(predicted_ids))
print("Reference:", test_dataset["sentence"][:2])
Downloads last month
11
Inference Examples
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.

Evaluation results

Model card error

This model's model-index metadata is invalid: Schema validation error. "model-index[0].results[0].dataset.type" with value "arabicspeech.org MGB-3" fails to match the required pattern: /^(?:[\w-]+\/)?[\w-.]+$/