This model was pre-trained on 2,000 hours of Arabic speech audio sampled at 16 kHz. When using the model, make sure that your speech input is also sampled at 16 kHz. Paper.
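The sampling-rate requirement can be checked before audio is passed to the model. The following is a minimal sketch using only Python's standard `wave` module; it assumes the input is an uncompressed WAV file, and the helper names `wav_sample_rate` and `is_expected_rate` are illustrative, not part of this model's tooling.

```python
import wave

# Sampling rate stated for the pre-training audio (16 kHz).
EXPECTED_RATE = 16_000


def wav_sample_rate(path):
    """Return the sampling rate in Hz recorded in a WAV file header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def is_expected_rate(path, expected=EXPECTED_RATE):
    """Return True if the WAV file at `path` is sampled at `expected` Hz."""
    return wav_sample_rate(path) == expected
```

A file that fails this check needs to be resampled to 16 kHz with an audio library of your choice before it is used as input.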
Note: This model does not have a tokenizer, as it was pre-trained on audio alone. To use this model for speech recognition, a tokenizer should be created and the model should be fine-tuned on labeled data (audio paired with text transcriptions).
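As an illustration of what creating a tokenizer can involve, the sketch below builds a character-level vocabulary from transcripts and saves it as JSON. It is an assumption-laden example, not the procedure used by the model's authors: the function names, the `|` word delimiter, and the `[UNK]` and `[PAD]` tokens are conventions chosen for this sketch, and it assumes the transcripts do not already contain the `|` character.

```python
import json


def build_char_vocab(transcripts):
    """Map every character seen in the transcripts to an integer id."""
    chars = sorted(set("".join(transcripts)))
    vocab = {ch: idx for idx, ch in enumerate(chars)}
    # Assumed convention: a visible "|" marks word boundaries instead of " ".
    if " " in vocab:
        vocab["|"] = vocab.pop(" ")
    # Assumed special tokens for unknown characters and padding.
    vocab["[UNK]"] = len(vocab)
    vocab["[PAD]"] = len(vocab)
    return vocab


def save_vocab(vocab, path):
    """Write the vocabulary to a JSON file, keeping non-ASCII text readable."""
    with open(path, "w", encoding="utf-8") as vocab_file:
        json.dump(vocab, vocab_file, ensure_ascii=False)
```

In practice the transcripts would come from the labeled fine-tuning data, and the saved file would then be loaded by whichever tokenizer class matches the fine-tuning setup.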
See this blog for more information on how to fine-tune the model. Note that the class
Wav2Vec2ForCTC has to be replaced by
This work is licensed under CC BY-NC-4.0.