Facebook's Wav2Vec2 large model, pretrained on the 100k-hour unlabeled subset of the VoxPopuli corpus.

Note: This model does not have a tokenizer, as it was pretrained on audio alone. To use this model for speech recognition, a tokenizer should be created and the model should be fine-tuned on labeled data (audio paired with text transcriptions). Check out this blog for a more detailed explanation of how to fine-tune the model.
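As a rough illustration of the "create a tokenizer" step, the sketch below builds a character-level vocabulary from transcriptions, which is the usual starting point for a CTC tokenizer. The function name `build_ctc_vocab`, the sample transcripts, and the special tokens `|`, `[UNK]`, and `[PAD]` are assumptions chosen for this example, not something this model card prescribes.

```python
import json


def build_ctc_vocab(transcripts):
    """Map every character seen in the transcripts to an integer id.

    Hypothetical helper for illustration only. It assumes the transcripts
    do not already contain the "|" character.
    """
    chars = sorted(set("".join(t.lower() for t in transcripts)))
    vocab = {ch: i for i, ch in enumerate(chars)}
    # Replace the space with a visible word delimiter, keeping its id.
    if " " in vocab:
        vocab["|"] = vocab.pop(" ")
    # Append an unknown token and a padding token (the latter is
    # commonly reused as the CTC blank).
    vocab["[UNK]"] = len(vocab)
    vocab["[PAD]"] = len(vocab)
    return vocab


# Hypothetical transcriptions standing in for a labeled dataset.
transcripts = ["hello world", "good morning"]
vocab = build_ctc_vocab(transcripts)

# Serialized form that could be saved as a vocabulary file.
vocab_json = json.dumps(vocab, ensure_ascii=False)
```

A vocabulary like this would typically be saved to a JSON file and passed to a CTC tokenizer class such as `Wav2Vec2CTCTokenizer` in `transformers`, with matching unknown, padding, and word-delimiter tokens.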

Paper: VoxPopuli: A Large-Scale Multilingual Speech Corpus for Representation Learning, Semi-Supervised Learning and Interpretation

Authors: Changhan Wang, Morgane Riviere, Ann Lee, Anne Wu, Chaitanya Talnikar, Daniel Haziza, Mary Williamson, Juan Pino, Emmanuel Dupoux from Facebook AI

See the official website for more information.


Please refer to this blog on how to fine-tune this model on a specific language. Note that you should replace "facebook/wav2vec2-large-xlsr-53" with this checkpoint for fine-tuning.
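The checkpoint swap described above might look like the following sketch. The Hub identifier in `CHECKPOINT` is an assumption (this card does not state its own repository name), and `load_for_ctc_finetuning` is a hypothetical helper. It relies on `Wav2Vec2ForCTC.from_pretrained` from the `transformers` library, with `vocab_size` and `pad_token_id` taken from whatever tokenizer you created.

```python
# Assumption: the Hub id of this checkpoint. Verify it before use,
# since it is not stated in this model card.
CHECKPOINT = "facebook/wav2vec2-large-100k-voxpopuli"


def load_for_ctc_finetuning(vocab_size, pad_token_id, checkpoint=CHECKPOINT):
    """Load the pretrained encoder with a freshly initialized CTC head.

    Hypothetical helper. `transformers` is imported lazily so that
    defining this function does not require the library or a download.
    """
    from transformers import Wav2Vec2ForCTC

    return Wav2Vec2ForCTC.from_pretrained(
        checkpoint,                  # used instead of "facebook/wav2vec2-large-xlsr-53"
        vocab_size=vocab_size,       # size of the tokenizer vocabulary
        pad_token_id=pad_token_id,   # id of the padding token in that vocabulary
        ctc_loss_reduction="mean",
    )
```

Calling `load_for_ctc_finetuning(len(vocab), vocab["[PAD]"])` with your own vocabulary would download the weights and return a model ready to be passed to a training loop.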
