CreaTiv Team (CTT): Dendi Numerals Automatic Speech Recognition
This repository contains an Automatic Speech Recognition (ASR) model specifically for recognizing numerals in the Dendi (ddn) language. The model can accurately recognize numbers ranging from 0 to 1,000,000,000 when spoken in Dendi.
This model is part of CreaTiv Team's Noulinmon project, a user-friendly mobile app designed to make calculations accessible in six local languages of Benin, featuring voice reading and AI capabilities. You can find more CTT-ASR models on the Hugging Face Hub: ssid32/ctt-asr.
CTT-ASR is available in the 🤗 Transformers library from version 4.4 onwards.
Model Details
The model is a fine-tuned version of facebook/wav2vec2-large-xlsr-53 on Dendi. When using this model, make sure that your speech input is sampled at 16 kHz.
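As a quick way to check the 16 kHz requirement before running inference, the sketch below reads the sampling rate from a WAV header using only the Python standard library. The helper names `wav_sample_rate` and `needs_resampling` are hypothetical, and the `wave` module only handles uncompressed PCM WAV files, so treat this as an illustration rather than a general-purpose check.

```python
import os
import tempfile
import wave

EXPECTED_RATE = 16_000  # sampling rate the model expects, per this model card


def wav_sample_rate(path: str) -> int:
    """Return the sampling rate (Hz) stored in a PCM WAV file header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def needs_resampling(path: str, expected_rate: int = EXPECTED_RATE) -> bool:
    """True if the file's sampling rate differs from the expected rate."""
    return wav_sample_rate(path) != expected_rate


# Demo: write one second of 16-bit mono silence at 8 kHz, then inspect it.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo_8khz.wav")
    with wave.open(demo_path, "wb") as wav_file:
        wav_file.setnchannels(1)      # mono
        wav_file.setsampwidth(2)      # 16-bit samples
        wav_file.setframerate(8_000)  # deliberately not 16 kHz
        wav_file.writeframes(b"\x00\x00" * 8_000)
    demo_rate = wav_sample_rate(demo_path)
    demo_needs_resampling = needs_resampling(demo_path)

print(demo_rate, demo_needs_resampling)
```

A file flagged this way would need to be resampled to 16 kHz before being passed to the processor.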
Usage
To use this model, first install the latest version of the 🤗 Transformers library:
pip install --upgrade transformers accelerate
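Then, run inference with the following code snippet:
import torch
import torchaudio
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor
# Load the processor (feature extractor + tokenizer) and the fine-tuned model
processor = Wav2Vec2Processor.from_pretrained("ssid32/wav2vec2-xlsr-dendi-ddn-for-numerals")
model = Wav2Vec2ForCTC.from_pretrained("ssid32/wav2vec2-xlsr-dendi-ddn-for-numerals")
# Load the audio and resample it to 16 kHz if it was recorded at another rate
speech_array, sampling_rate = torchaudio.load("audio_test.wav")
if sampling_rate != 16_000:
    speech_array = torchaudio.functional.resample(speech_array, sampling_rate, 16_000)
# Drop the channel dimension (assumes a mono recording)
speech_array = speech_array.squeeze().numpy()
inputs = processor(speech_array, sampling_rate=16_000, return_tensors="pt", padding=True)
with torch.no_grad():
    logits = model(inputs.input_values, attention_mask=inputs.attention_mask).logits
# Pick the most likely token per frame and decode to text
output = processor.batch_decode(torch.argmax(logits, dim=-1))
print("Output:", output)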
Upon processing the sample audio, the model produces the following output:
Output: ['zangu ihaaku nda weiguu']
In this case, the output represents the numeral 850 in the Dendi language.
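The `torch.argmax` followed by `batch_decode` step in the snippet above amounts to greedy CTC decoding: consecutive repeated frame predictions are collapsed, blank (padding) tokens are dropped, and the word delimiter becomes a space. The sketch below illustrates that idea in plain Python. The function name `greedy_ctc_decode`, the toy vocabulary, and the frame sequence are all hypothetical and are not the model's actual vocabulary or output.

```python
from itertools import groupby


def greedy_ctc_decode(frame_ids, id_to_token, blank_id=0, word_delimiter="|"):
    """Collapse repeats, drop blanks, and map the word delimiter to a space."""
    # 1. Merge runs of identical consecutive predictions into one token id.
    collapsed = [token_id for token_id, _ in groupby(frame_ids)]
    # 2. Remove the CTC blank token.
    tokens = [id_to_token[i] for i in collapsed if i != blank_id]
    # 3. Join characters and turn word delimiters into spaces.
    return "".join(tokens).replace(word_delimiter, " ").strip()


# Hypothetical toy vocabulary (the real model ships its own vocab file).
toy_vocab = {0: "<pad>", 1: "|", 2: "n", 3: "d", 4: "a"}

# Hypothetical per-frame argmax ids spelling the word "nda".
frame_ids = [0, 2, 2, 0, 3, 4, 4, 0, 1]
decoded = greedy_ctc_decode(frame_ids, toy_vocab)
print(decoded)  # nda

# A blank between two identical ids keeps both characters.
doubled = greedy_ctc_decode([4, 4, 0, 4], toy_vocab)
print(doubled)  # aa
```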
Evaluation result
On a test set, the model achieves a Word Error Rate (WER) of 18.18%.
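For readers unfamiliar with the metric, WER is the word-level edit distance (substitutions, deletions, and insertions) between the reference transcript and the model output, divided by the number of reference words. The following is a minimal sketch of that computation, not the evaluation script used for this model. The function name `word_error_rate` is hypothetical, and the hypothesis string is an illustrative truncation of the sample output above, not a real model prediction.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


reference = "zangu ihaaku nda weiguu"  # sample output from this model card
hypothesis = "zangu ihaaku nda"        # illustrative: last word dropped
print(f"WER: {word_error_rate(reference, hypothesis):.2%}")  # WER: 25.00%
```

One missing word out of four reference words gives a WER of 25%.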
Authors
This model was developed by:
- Salim KORA GUERA (HuggingFace Username: ssid32) | (koravant1@gmail.com)
- Etienne TOVIMAFA (HuggingFace Username: MrBendji) | (abiodouneti@gmail.com)
Citation
@misc {
author = { {Salim KORA GUERA and Etienne TOVIMAFA} },
title = { wav2vec2-xlsr-dendi-ddn-for-numerals },
year = 2024,
url = { https://huggingface.co/ssid32/wav2vec2-xlsr-dendi-ddn-for-numerals },
doi = { 10.57967/hf/2930 },
publisher = { Hugging Face }
}
License
The model is licensed as CC-BY-NC 4.0.