
Arabic Hubert-Large

This model was pre-trained on 2,000 hours of 16kHz-sampled Arabic speech audio. When using the model, make sure that your speech input is also sampled at 16kHz. Paper.

Training of this model was performed using fairseq. Tensorboard logs of the training can be found here.

Note: This model does not have a tokenizer as it was pretrained on audio alone. In order to use this model for speech recognition, a tokenizer should be created and the model should be fine-tuned on labeled text data.
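Even without a tokenizer, the pretrained checkpoint can be used directly as a speech feature extractor. A minimal sketch (the Hub model ID below is an assumption, substitute the actual one; requires `transformers` and `torch`):

```python
# Sketch: using the pretrained (tokenizer-less) model to extract speech features.
# The Hub ID "asafaya/hubert-large-arabic" is an assumption, not confirmed by this card.

def output_frames(n_samples: int) -> int:
    """Frames produced by the standard HuBERT/wav2vec2 convolutional feature
    encoder (kernel/stride pairs below) for a 16kHz waveform of n_samples."""
    for kernel, stride in [(10, 5), (3, 2), (3, 2), (3, 2), (3, 2), (2, 2), (2, 2)]:
        n_samples = (n_samples - kernel) // stride + 1
    return n_samples

def extract_features(waveform_16khz):
    """Return hidden states of shape (batch, frames, 1024) for a 1-D waveform
    sampled at 16kHz (1024 is the hidden size of the Large architecture)."""
    import torch
    from transformers import HubertModel, Wav2Vec2FeatureExtractor

    model_id = "asafaya/hubert-large-arabic"  # hypothetical ID
    feature_extractor = Wav2Vec2FeatureExtractor(
        feature_size=1, sampling_rate=16_000, padding_value=0.0, do_normalize=True
    )
    model = HubertModel.from_pretrained(model_id).eval()
    inputs = feature_extractor(waveform_16khz, sampling_rate=16_000, return_tensors="pt")
    with torch.no_grad():
        return model(**inputs).last_hidden_state
```

One second of audio (16,000 samples) yields 49 frames, i.e. roughly a 50Hz frame rate.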


See this blog for more information on how to fine-tune the model. Note that the class Wav2Vec2ForCTC has to be replaced by HubertForCTC.
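Following the blog's recipe, the fine-tuning setup could be sketched as below: a character-level CTC vocabulary is built from the training transcripts, then the pretrained encoder is loaded with a fresh CTC head. The Hub model ID is an assumption, and the `[UNK]`/`[PAD]` convention follows the blog.

```python
# Sketch of the fine-tuning setup from the blog, with HubertForCTC in place of
# Wav2Vec2ForCTC. The Hub ID "asafaya/hubert-large-arabic" is an assumption.

def build_vocab(transcripts):
    """Character-level CTC vocabulary built from the training transcripts,
    as done in the fine-tuning blog."""
    chars = sorted(set("".join(transcripts)))
    vocab = {c: i for i, c in enumerate(chars)}
    vocab["[UNK]"] = len(vocab)
    vocab["[PAD]"] = len(vocab)  # also serves as the CTC blank token
    return vocab

def load_ctc_model(model_id: str, vocab: dict):
    """Pretrained encoder plus a randomly initialized CTC head sized to the vocab."""
    from transformers import HubertForCTC  # note: HubertForCTC, not Wav2Vec2ForCTC

    return HubertForCTC.from_pretrained(
        model_id,
        vocab_size=len(vocab),
        ctc_loss_reduction="mean",
        pad_token_id=vocab["[PAD]"],
    )

if __name__ == "__main__":
    vocab = build_vocab(["مرحبا", "شكرا"])  # toy transcripts for illustration
    model = load_ctc_model("asafaya/hubert-large-arabic", vocab)  # hypothetical ID
```

After this setup, training proceeds as in the blog (a `Trainer` with a CTC data collator over 16kHz audio and transcript pairs).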


This work is licensed under CC BY-NC 4.0.


Model pre-training and data processing for this work were partially performed at the KUIS AI Center Cluster and the TUBITAK ULAKBIM Cluster (TRUBA resources).
