---
language: ary
metrics:
- wer
tags:
- audio
- automatic-speech-recognition
- speech
- xlsr-fine-tuning-week
license: apache-2.0
model-index:
- name: XLSR Wav2Vec2 Moroccan Arabic dialect by Boumehdi
  results:
  - task:
      name: Speech Recognition
      type: automatic-speech-recognition
    metrics:
    - name: Test WER
      type: wer
      value: 0.09
---

# Wav2Vec2-Large-XLSR-53-Moroccan-Darija

**wav2vec2-large-xlsr-53** fine-tuned on 120 hours of labeled Darija audio.

## Usage

The model can be used directly as follows:

```python
import librosa
import torch
from transformers import Wav2Vec2CTCTokenizer, Wav2Vec2ForCTC, Wav2Vec2Processor

# build the tokenizer from a local vocabulary file (expects vocab.json in the working directory)
tokenizer = Wav2Vec2CTCTokenizer("./vocab.json", unk_token="[UNK]", pad_token="[PAD]", word_delimiter_token="|")
processor = Wav2Vec2Processor.from_pretrained('boumehdi/wav2vec2-large-xlsr-moroccan-darija', tokenizer=tokenizer)
model = Wav2Vec2ForCTC.from_pretrained('boumehdi/wav2vec2-large-xlsr-moroccan-darija')
model.eval()

# load the audio data, resampled to 16 kHz (use your own wav file here!)
input_audio, sr = librosa.load('file.wav', sr=16000)

# extract input features
input_values = processor(input_audio, sampling_rate=16000, return_tensors="pt", padding=True).input_values

# retrieve logits (no gradients are needed for inference)
with torch.no_grad():
    logits = model(input_values).logits

# greedy decoding: pick the most likely token for each frame
tokens = torch.argmax(logits, dim=-1)

# convert token ids to text (CTC decoding, no language model)
transcription = tokenizer.batch_decode(tokens)

# print the output
print(transcription)
```
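
The decoding step above is greedy CTC decoding: after the per-frame argmax, repeated ids are merged and blank tokens are dropped. The following is a minimal sketch of that rule in plain Python, not the tokenizer's actual implementation. The function name `ctc_greedy_collapse`, the frame ids, and `blank_id=0` are illustrative assumptions; with `Wav2Vec2CTCTokenizer` the pad token plays the role of the blank.

```python
from itertools import groupby


def ctc_greedy_collapse(token_ids, blank_id=0):
    """Merge consecutive duplicate ids, then drop the blank id (CTC greedy rule)."""
    # Step 1: keep one id per run of identical consecutive ids.
    collapsed = [token for token, _ in groupby(token_ids)]
    # Step 2: remove blanks, which only serve to separate repeated symbols.
    return [token for token in collapsed if token != blank_id]


# Hypothetical per-frame argmax ids (0 is assumed to be the blank).
frame_ids = [0, 5, 5, 0, 5, 7, 7, 0, 0]
print(ctc_greedy_collapse(frame_ids))  # [5, 5, 7]
```

The blank between the two runs of `5` is what allows the same symbol to appear twice in a row in the final output.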

Here's the output: قالت ليا هاد السيد هادا ما كاينش بحالو
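
To check a transcription like this against a reference transcript, word error rate (the metric listed in the card metadata) can be computed as the word-level edit distance divided by the number of reference words. This is a minimal sketch under that standard definition, not the evaluation script used for the reported score. The function name `word_error_rate` and the example strings are hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref_words)


# Hypothetical strings: one substitution and one deletion over four reference words.
print(word_error_rate("a b c d", "a x c"))  # 0.5
```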

email: souregh@gmail.com