Edit model card
YAML Metadata Error: "language" with value "uralic" is not valid. It must be an ISO 639-1, 639-2 or 639-3 code (two/three letters), or a special value like "code", "multilingual". If you want to use BCP-47 identifiers, you can specify them in language_bcp47.

Wav2Vec2-large-VoxPopuli-V2

Facebook's Wav2Vec2 large model pretrained only in uralic on 42.5 unlabeled datat of the VoxPopuli corpus.

The model is pretrained on 16kHz sampled speech audio. When using the model make sure that your speech input is also sampled at 16Khz.

Note: This model does not have a tokenizer as it was pretrained on audio alone. In order to use this model for speech recognition, a tokenizer should be created and the model should be fine-tuned on labeled text data in uralic. Check out this blog for a more in-detail explanation of how to fine-tune the model.

Paper: VoxPopuli: A Large-Scale Multilingual Speech Corpus for Representation Learning, Semi-Supervised Learning and Interpretation

Authors: Changhan Wang, Morgane Riviere, Ann Lee, Anne Wu, Chaitanya Talnikar, Daniel Haziza, Mary Williamson, Juan Pino, Emmanuel Dupoux from Facebook AI.

See the official website for more information, here.

Downloads last month
2
Hosted inference API

Inference API has been turned off for this model.