Full Text Search - Hugging Face

+ 1,000 results

facebook / wav2vec2-base-960h

README.md

model

29 matches

tags: transformers, pytorch, tf, safetensors, wav2vec2, automatic-speech-recognition, audio, hf-asr-leaderboard, en, dataset:librispeech_asr, arxiv:2006.11477, license:apache-2.0, model-index, endpoints_compatible, has_space, region:us

# Wav2Vec2-Base-960h

[Facebook's Wav2Vec2](https://ai.facebook.com/blog/wav2vec-20-learning-the-structure-of-speech-from-raw-audio/)

The base model pretrained and fine-tuned on 960 hours of Librispeech on 16kHz sampled speech audio. When using the model

facebook / wav2vec2-base

README.md

model

11 matches

tags: transformers, pytorch, wav2vec2, pretraining, speech, en, dataset:librispeech_asr, arxiv:2006.11477, license:apache-2.0, endpoints_compatible, has_space, region:us

# Wav2Vec2-Base

[Facebook's Wav2Vec2](https://ai.facebook.com/blog/wav2vec-20-learning-the-structure-of-speech-from-raw-audio/)

The base model pretrained on 16kHz sampled speech audio. When using the model, make sure that your speech input is also sampled at 16kHz.
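The sampling-rate requirement above can be illustrated with a minimal sketch. The function name `resample_linear` and the constant `TARGET_RATE` are hypothetical and are not taken from the model card. The sketch uses plain linear interpolation with no anti-aliasing filter, so it only shows the idea of converting audio to 16kHz; for real use, a dedicated audio library's resampler would normally be preferred.

```python
TARGET_RATE = 16_000  # sampling rate (Hz) the model card says the input should have


def resample_linear(samples, orig_rate, target_rate=TARGET_RATE):
    """Resample a mono signal by linear interpolation.

    Illustrative only: no low-pass filtering is applied, so downsampling
    real audio this way can introduce aliasing.
    """
    if orig_rate <= 0 or target_rate <= 0:
        raise ValueError("sampling rates must be positive")
    # Nothing to do if the rate already matches or there is nothing to interpolate.
    if orig_rate == target_rate or len(samples) < 2:
        return list(samples)

    n_out = max(1, round(len(samples) * target_rate / orig_rate))
    step = orig_rate / target_rate  # input samples advanced per output sample
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        pos = i * step
        left = min(int(pos), last)
        right = min(left + 1, last)
        frac = pos - left
        # Weighted average of the two neighbouring input samples.
        out.append(samples[left] * (1.0 - frac) + samples[right] * frac)
    return out


if __name__ == "__main__":
    one_second_48k = [float(i) for i in range(48_000)]
    audio_16k = resample_linear(one_second_48k, orig_rate=48_000)
    print(len(audio_16k))  # number of samples after conversion to 16kHz
```

Checking the sampling rate of a recording before passing it to the model, and converting only when it differs, avoids silently feeding audio at the wrong rate.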

facebook / wav2vec2-base-fi-voxpopuli-v2

README.md

model

8 matches

tags: transformers, pytorch, wav2vec2, pretraining, audio, automatic-speech-recognition, voxpopuli-v2, fi, dataset:voxpopuli, arxiv:2101.00390, license:cc-by-nc-4.0, region:us

# Wav2Vec2-base-VoxPopuli-V2

[Facebook's Wav2Vec2](https://ai.facebook.com/blog/wav2vec-20-learning-the-structure-of-speech-from-raw-audio/) base model pretrained only in **fi** on **14.2k** unlabeled data from the [VoxPopuli corpus](https://arxiv.org/abs/2101.00390).

The model is pretrained on 16kHz sampled speech audio. When using the model, make sure that your speech input is also sampled at 16kHz.

facebook / wav2vec2-base-fr-voxpopuli-v2

README.md

model

8 matches

tags: transformers, pytorch, wav2vec2, pretraining, audio, automatic-speech-recognition, voxpopuli-v2, fr, dataset:voxpopuli, arxiv:2101.00390, license:cc-by-nc-4.0, region:us

# Wav2Vec2-base-VoxPopuli-V2

[Facebook's Wav2Vec2](https://ai.facebook.com/blog/wav2vec-20-learning-the-structure-of-speech-from-raw-audio/) base model pretrained only in **fr** on **22.8k** unlabeled data from the [VoxPopuli corpus](https://arxiv.org/abs/2101.00390).

The model is pretrained on 16kHz sampled speech audio. When using the model, make sure that your speech input is also sampled at 16kHz.

facebook / wav2vec2-base-fr-voxpopuli

README.md

model

10 matches

tags: transformers, pytorch, wav2vec2, pretraining, audio, automatic-speech-recognition, voxpopuli, fr, arxiv:2101.00390, license:cc-by-nc-4.0, endpoints_compatible, region:us

# Wav2Vec2-Base-VoxPopuli

[Facebook's Wav2Vec2](https://ai.facebook.com/blog/wav2vec-20-learning-the-structure-of-speech-from-raw-audio/) base model pretrained on the fr unlabeled subset of the [VoxPopuli corpus](https://arxiv.org/abs/2101.00390).

**Paper**: *[VoxPopuli: A Large-Scale Multilingual Speech Corpus for Representation

facebook / wav2vec2-base-hr-voxpopuli-v2

README.md

model

8 matches

tags: transformers, pytorch, wav2vec2, pretraining, audio, automatic-speech-recognition, voxpopuli-v2, hr, dataset:voxpopuli, arxiv:2101.00390, license:cc-by-nc-4.0, region:us

# Wav2Vec2-base-VoxPopuli-V2

[Facebook's Wav2Vec2](https://ai.facebook.com/blog/wav2vec-20-learning-the-structure-of-speech-from-raw-audio/) base model pretrained only in **hr** on **8.1k** unlabeled data from the [VoxPopuli corpus](https://arxiv.org/abs/2101.00390).

The model is pretrained on 16kHz sampled speech audio. When using the model, make sure that your speech input is also sampled at 16kHz.

facebook / wav2vec2-base-hu-voxpopuli-v2

README.md

model

8 matches

tags: transformers, pytorch, wav2vec2, pretraining, audio, automatic-speech-recognition, voxpopuli-v2, hu, dataset:voxpopuli, arxiv:2101.00390, license:cc-by-nc-4.0, region:us

# Wav2Vec2-base-VoxPopuli-V2

[Facebook's Wav2Vec2](https://ai.facebook.com/blog/wav2vec-20-learning-the-structure-of-speech-from-raw-audio/) base model pretrained only in **hu** on **17.7k** unlabeled data from the [VoxPopuli corpus](https://arxiv.org/abs/2101.00390).

The model is pretrained on 16kHz sampled speech audio. When using the model, make sure that your speech input is also sampled at 16kHz.

facebook / wav2vec2-base-it-voxpopuli-v2

README.md

model

8 matches

tags: transformers, pytorch, wav2vec2, pretraining, audio, automatic-speech-recognition, voxpopuli-v2, it, dataset:voxpopuli, arxiv:2101.00390, license:cc-by-nc-4.0, region:us

# Wav2Vec2-base-VoxPopuli-V2

[Facebook's Wav2Vec2](https://ai.facebook.com/blog/wav2vec-20-learning-the-structure-of-speech-from-raw-audio/) base model pretrained only in **it** on **21.9k** unlabeled data from the [VoxPopuli corpus](https://arxiv.org/abs/2101.00390).

The model is pretrained on 16kHz sampled speech audio. When using the model, make sure that your speech input is also sampled at 16kHz.

facebook / wav2vec2-base-it-voxpopuli

README.md

model

10 matches

tags: transformers, pytorch, wav2vec2, pretraining, audio, automatic-speech-recognition, voxpopuli, it, arxiv:2101.00390, license:cc-by-nc-4.0, endpoints_compatible, region:us

# Wav2Vec2-Base-VoxPopuli

[Facebook's Wav2Vec2](https://ai.facebook.com/blog/wav2vec-20-learning-the-structure-of-speech-from-raw-audio/) base model pretrained on the it unlabeled subset of the [VoxPopuli corpus](https://arxiv.org/abs/2101.00390).

**Paper**: *[VoxPopuli: A Large-Scale Multilingual Speech Corpus for Representation

facebook / wav2vec2-base-lt-voxpopuli-v2

README.md

model

8 matches

tags: transformers, pytorch, wav2vec2, pretraining, audio, automatic-speech-recognition, voxpopuli-v2, lt, dataset:voxpopuli, arxiv:2101.00390, license:cc-by-nc-4.0, region:us

# Wav2Vec2-base-VoxPopuli-V2

[Facebook's Wav2Vec2](https://ai.facebook.com/blog/wav2vec-20-learning-the-structure-of-speech-from-raw-audio/) base model pretrained only in **lt** on **14.4k** unlabeled data from the [VoxPopuli corpus](https://arxiv.org/abs/2101.00390).

The model is pretrained on 16kHz sampled speech audio. When using the model, make sure that your speech input is also sampled at 16kHz.

facebook / wav2vec2-base-lv-voxpopuli-v2

README.md

model

8 matches

tags: transformers, pytorch, wav2vec2, pretraining, audio, automatic-speech-recognition, voxpopuli-v2, lv, dataset:voxpopuli, arxiv:2101.00390, license:cc-by-nc-4.0, region:us

# Wav2Vec2-base-VoxPopuli-V2

[Facebook's Wav2Vec2](https://ai.facebook.com/blog/wav2vec-20-learning-the-structure-of-speech-from-raw-audio/) base model pretrained only in **lv** on **13.1k** unlabeled data from the [VoxPopuli corpus](https://arxiv.org/abs/2101.00390).

The model is pretrained on 16kHz sampled speech audio. When using the model, make sure that your speech input is also sampled at 16kHz.

facebook / wav2vec2-base-mt-voxpopuli-v2

README.md

model

8 matches

tags: transformers, pytorch, wav2vec2, pretraining, audio, automatic-speech-recognition, voxpopuli-v2, mt, dataset:voxpopuli, arxiv:2101.00390, license:cc-by-nc-4.0, region:us

# Wav2Vec2-base-VoxPopuli-V2

[Facebook's Wav2Vec2](https://ai.facebook.com/blog/wav2vec-20-learning-the-structure-of-speech-from-raw-audio/) base model pretrained only in **mt** on **9.1k** unlabeled data from the [VoxPopuli corpus](https://arxiv.org/abs/2101.00390).

The model is pretrained on 16kHz sampled speech audio. When using the model, make sure that your speech input is also sampled at 16kHz.

facebook / wav2vec2-base-nl-voxpopuli-v2

README.md

model

8 matches

tags: transformers, pytorch, wav2vec2, pretraining, audio, automatic-speech-recognition, voxpopuli-v2, nl, dataset:voxpopuli, arxiv:2101.00390, license:cc-by-nc-4.0, region:us

# Wav2Vec2-base-VoxPopuli-V2

[Facebook's Wav2Vec2](https://ai.facebook.com/blog/wav2vec-20-learning-the-structure-of-speech-from-raw-audio/) base model pretrained only in **nl** on **19.0k** unlabeled data from the [VoxPopuli corpus](https://arxiv.org/abs/2101.00390).

The model is pretrained on 16kHz sampled speech audio. When using the model, make sure that your speech input is also sampled at 16kHz.

facebook / wav2vec2-base-nl-voxpopuli

README.md

model

10 matches

tags: transformers, pytorch, wav2vec2, pretraining, audio, automatic-speech-recognition, voxpopuli, nl, arxiv:2101.00390, license:cc-by-nc-4.0, endpoints_compatible, region:us

# Wav2Vec2-Base-VoxPopuli

[Facebook's Wav2Vec2](https://ai.facebook.com/blog/wav2vec-20-learning-the-structure-of-speech-from-raw-audio/) base model pretrained on the nl unlabeled subset of the [VoxPopuli corpus](https://arxiv.org/abs/2101.00390).

**Paper**: *[VoxPopuli: A Large-Scale Multilingual Speech Corpus for Representation

facebook / wav2vec2-base-pl-voxpopuli-v2

README.md

model

8 matches

tags: transformers, pytorch, wav2vec2, pretraining, audio, automatic-speech-recognition, voxpopuli-v2, pl, dataset:voxpopuli, arxiv:2101.00390, license:cc-by-nc-4.0, region:us

# Wav2Vec2-base-VoxPopuli-V2

[Facebook's Wav2Vec2](https://ai.facebook.com/blog/wav2vec-20-learning-the-structure-of-speech-from-raw-audio/) base model pretrained only in **pl** on **21.2k** unlabeled data from the [VoxPopuli corpus](https://arxiv.org/abs/2101.00390).

The model is pretrained on 16kHz sampled speech audio. When using the model, make sure that your speech input is also sampled at 16kHz.

facebook / wav2vec2-base-pt-voxpopuli-v2

README.md

model

8 matches

tags: transformers, pytorch, wav2vec2, pretraining, audio, automatic-speech-recognition, voxpopuli-v2, pt, dataset:voxpopuli, arxiv:2101.00390, license:cc-by-nc-4.0, region:us

# Wav2Vec2-base-VoxPopuli-V2

[Facebook's Wav2Vec2](https://ai.facebook.com/blog/wav2vec-20-learning-the-structure-of-speech-from-raw-audio/) base model pretrained only in **pt** on **17.5k** unlabeled data from the [VoxPopuli corpus](https://arxiv.org/abs/2101.00390).

The model is pretrained on 16kHz sampled speech audio. When using the model, make sure that your speech input is also sampled at 16kHz.

facebook / wav2vec2-base-ro-voxpopuli-v2

README.md

model

8 matches

tags: transformers, pytorch, wav2vec2, pretraining, audio, automatic-speech-recognition, voxpopuli-v2, ro, dataset:voxpopuli, arxiv:2101.00390, license:cc-by-nc-4.0, region:us

# Wav2Vec2-base-VoxPopuli-V2

[Facebook's Wav2Vec2](https://ai.facebook.com/blog/wav2vec-20-learning-the-structure-of-speech-from-raw-audio/) base model pretrained only in **ro** on **17.9k** unlabeled data from the [VoxPopuli corpus](https://arxiv.org/abs/2101.00390).

The model is pretrained on 16kHz sampled speech audio. When using the model, make sure that your speech input is also sampled at 16kHz.

facebook / wav2vec2-base-sk-voxpopuli-v2

README.md

model

8 matches

tags: transformers, pytorch, wav2vec2, pretraining, audio, automatic-speech-recognition, voxpopuli-v2, sk, dataset:voxpopuli, arxiv:2101.00390, license:cc-by-nc-4.0, region:us

# Wav2Vec2-base-VoxPopuli-V2

[Facebook's Wav2Vec2](https://ai.facebook.com/blog/wav2vec-20-learning-the-structure-of-speech-from-raw-audio/) base model pretrained only in **sk** on **12.1k** unlabeled data from the [VoxPopuli corpus](https://arxiv.org/abs/2101.00390).

The model is pretrained on 16kHz sampled speech audio. When using the model, make sure that your speech input is also sampled at 16kHz.

facebook / wav2vec2-base-sl-voxpopuli-v2

README.md

model

8 matches

tags: transformers, pytorch, wav2vec2, pretraining, audio, automatic-speech-recognition, voxpopuli-v2, sl, dataset:voxpopuli, arxiv:2101.00390, license:cc-by-nc-4.0, region:us

# Wav2Vec2-base-VoxPopuli-V2

[Facebook's Wav2Vec2](https://ai.facebook.com/blog/wav2vec-20-learning-the-structure-of-speech-from-raw-audio/) base model pretrained only in **sl** on **11.3k** unlabeled data from the [VoxPopuli corpus](https://arxiv.org/abs/2101.00390).

The model is pretrained on 16kHz sampled speech audio. When using the model, make sure that your speech input is also sampled at 16kHz.

facebook / wav2vec2-base-sv-voxpopuli-v2

README.md

model

8 matches

tags: transformers, pytorch, wav2vec2, pretraining, audio, automatic-speech-recognition, voxpopuli-v2, sv, dataset:voxpopuli, arxiv:2101.00390, license:cc-by-nc-4.0, region:us

# Wav2Vec2-base-VoxPopuli-V2

[Facebook's Wav2Vec2](https://ai.facebook.com/blog/wav2vec-20-learning-the-structure-of-speech-from-raw-audio/) base model pretrained only in **sv** on **16.3k** unlabeled data from the [VoxPopuli corpus](https://arxiv.org/abs/2101.00390).

The model is pretrained on 16kHz sampled speech audio. When using the model, make sure that your speech input is also sampled at 16kHz.