Full-text search
+ 1,000 results
facebook / wav2vec2-base-960h
README.md
model
29 matches
tags:
transformers, pytorch, tf, safetensors, wav2vec2, automatic-speech-recognition, audio, hf-asr-leaderboard, en, dataset:librispeech_asr, arxiv:2006.11477, license:apache-2.0, model-index, endpoints_compatible, has_space, region:us
# Wav2Vec2-Base-960h
[Facebook's Wav2Vec2](https://ai.facebook.com/blog/wav2vec-20-learning-the-structure-of-speech-from-raw-audio/)
The base model pretrained and fine-tuned on 960 hours of LibriSpeech, using speech audio sampled at 16 kHz. When using the model
facebook / wav2vec2-base
README.md
model
11 matches
tags:
transformers, pytorch, wav2vec2, pretraining, speech, en, dataset:librispeech_asr, arxiv:2006.11477, license:apache-2.0, endpoints_compatible, has_space, region:us
# Wav2Vec2-Base
[Facebook's Wav2Vec2](https://ai.facebook.com/blog/wav2vec-20-learning-the-structure-of-speech-from-raw-audio/)
The base model pretrained on speech audio sampled at 16 kHz. When using the model, make sure that your speech input is also sampled at 16 kHz.
facebook / wav2vec2-base-fi-voxpopuli-v2
README.md
model
8 matches
tags:
transformers, pytorch, wav2vec2, pretraining, audio, automatic-speech-recognition, voxpopuli-v2, fi, dataset:voxpopuli, arxiv:2101.00390, license:cc-by-nc-4.0, region:us
# Wav2Vec2-base-VoxPopuli-V2
[Facebook's Wav2Vec2](https://ai.facebook.com/blog/wav2vec-20-learning-the-structure-of-speech-from-raw-audio/) base model pretrained only in **fi** on **14.2k** hours of unlabeled data from the [VoxPopuli corpus](https://arxiv.org/abs/2101.00390).
The model is pretrained on speech audio sampled at 16 kHz. When using the model, make sure that your speech input is also sampled at 16 kHz.
facebook / wav2vec2-base-fr-voxpopuli-v2
README.md
model
8 matches
tags:
transformers, pytorch, wav2vec2, pretraining, audio, automatic-speech-recognition, voxpopuli-v2, fr, dataset:voxpopuli, arxiv:2101.00390, license:cc-by-nc-4.0, region:us
# Wav2Vec2-base-VoxPopuli-V2
[Facebook's Wav2Vec2](https://ai.facebook.com/blog/wav2vec-20-learning-the-structure-of-speech-from-raw-audio/) base model pretrained only in **fr** on **22.8k** hours of unlabeled data from the [VoxPopuli corpus](https://arxiv.org/abs/2101.00390).
The model is pretrained on speech audio sampled at 16 kHz. When using the model, make sure that your speech input is also sampled at 16 kHz.
facebook / wav2vec2-base-fr-voxpopuli
README.md
model
10 matches
tags:
transformers, pytorch, wav2vec2, pretraining, audio, automatic-speech-recognition, voxpopuli, fr, arxiv:2101.00390, license:cc-by-nc-4.0, endpoints_compatible, region:us
# Wav2Vec2-Base-VoxPopuli
[Facebook's Wav2Vec2](https://ai.facebook.com/blog/wav2vec-20-learning-the-structure-of-speech-from-raw-audio/) base model pretrained on the unlabeled **fr** subset of the [VoxPopuli corpus](https://arxiv.org/abs/2101.00390).
**Paper**: *[VoxPopuli: A Large-Scale Multilingual Speech Corpus for Representation
facebook / wav2vec2-base-hr-voxpopuli-v2
README.md
model
8 matches
tags:
transformers, pytorch, wav2vec2, pretraining, audio, automatic-speech-recognition, voxpopuli-v2, hr, dataset:voxpopuli, arxiv:2101.00390, license:cc-by-nc-4.0, region:us
# Wav2Vec2-base-VoxPopuli-V2
[Facebook's Wav2Vec2](https://ai.facebook.com/blog/wav2vec-20-learning-the-structure-of-speech-from-raw-audio/) base model pretrained only in **hr** on **8.1k** hours of unlabeled data from the [VoxPopuli corpus](https://arxiv.org/abs/2101.00390).
The model is pretrained on speech audio sampled at 16 kHz. When using the model, make sure that your speech input is also sampled at 16 kHz.
facebook / wav2vec2-base-hu-voxpopuli-v2
README.md
model
8 matches
tags:
transformers, pytorch, wav2vec2, pretraining, audio, automatic-speech-recognition, voxpopuli-v2, hu, dataset:voxpopuli, arxiv:2101.00390, license:cc-by-nc-4.0, region:us
# Wav2Vec2-base-VoxPopuli-V2
[Facebook's Wav2Vec2](https://ai.facebook.com/blog/wav2vec-20-learning-the-structure-of-speech-from-raw-audio/) base model pretrained only in **hu** on **17.7k** hours of unlabeled data from the [VoxPopuli corpus](https://arxiv.org/abs/2101.00390).
The model is pretrained on speech audio sampled at 16 kHz. When using the model, make sure that your speech input is also sampled at 16 kHz.
facebook / wav2vec2-base-it-voxpopuli-v2
README.md
model
8 matches
tags:
transformers, pytorch, wav2vec2, pretraining, audio, automatic-speech-recognition, voxpopuli-v2, it, dataset:voxpopuli, arxiv:2101.00390, license:cc-by-nc-4.0, region:us
# Wav2Vec2-base-VoxPopuli-V2
[Facebook's Wav2Vec2](https://ai.facebook.com/blog/wav2vec-20-learning-the-structure-of-speech-from-raw-audio/) base model pretrained only in **it** on **21.9k** hours of unlabeled data from the [VoxPopuli corpus](https://arxiv.org/abs/2101.00390).
The model is pretrained on speech audio sampled at 16 kHz. When using the model, make sure that your speech input is also sampled at 16 kHz.
facebook / wav2vec2-base-it-voxpopuli
README.md
model
10 matches
tags:
transformers, pytorch, wav2vec2, pretraining, audio, automatic-speech-recognition, voxpopuli, it, arxiv:2101.00390, license:cc-by-nc-4.0, endpoints_compatible, region:us
# Wav2Vec2-Base-VoxPopuli
[Facebook's Wav2Vec2](https://ai.facebook.com/blog/wav2vec-20-learning-the-structure-of-speech-from-raw-audio/) base model pretrained on the unlabeled **it** subset of the [VoxPopuli corpus](https://arxiv.org/abs/2101.00390).
**Paper**: *[VoxPopuli: A Large-Scale Multilingual Speech Corpus for Representation
facebook / wav2vec2-base-lt-voxpopuli-v2
README.md
model
8 matches
tags:
transformers, pytorch, wav2vec2, pretraining, audio, automatic-speech-recognition, voxpopuli-v2, lt, dataset:voxpopuli, arxiv:2101.00390, license:cc-by-nc-4.0, region:us
# Wav2Vec2-base-VoxPopuli-V2
[Facebook's Wav2Vec2](https://ai.facebook.com/blog/wav2vec-20-learning-the-structure-of-speech-from-raw-audio/) base model pretrained only in **lt** on **14.4k** hours of unlabeled data from the [VoxPopuli corpus](https://arxiv.org/abs/2101.00390).
The model is pretrained on speech audio sampled at 16 kHz. When using the model, make sure that your speech input is also sampled at 16 kHz.
facebook / wav2vec2-base-lv-voxpopuli-v2
README.md
model
8 matches
tags:
transformers, pytorch, wav2vec2, pretraining, audio, automatic-speech-recognition, voxpopuli-v2, lv, dataset:voxpopuli, arxiv:2101.00390, license:cc-by-nc-4.0, region:us
# Wav2Vec2-base-VoxPopuli-V2
[Facebook's Wav2Vec2](https://ai.facebook.com/blog/wav2vec-20-learning-the-structure-of-speech-from-raw-audio/) base model pretrained only in **lv** on **13.1k** hours of unlabeled data from the [VoxPopuli corpus](https://arxiv.org/abs/2101.00390).
The model is pretrained on speech audio sampled at 16 kHz. When using the model, make sure that your speech input is also sampled at 16 kHz.
facebook / wav2vec2-base-mt-voxpopuli-v2
README.md
model
8 matches
tags:
transformers, pytorch, wav2vec2, pretraining, audio, automatic-speech-recognition, voxpopuli-v2, mt, dataset:voxpopuli, arxiv:2101.00390, license:cc-by-nc-4.0, region:us
# Wav2Vec2-base-VoxPopuli-V2
[Facebook's Wav2Vec2](https://ai.facebook.com/blog/wav2vec-20-learning-the-structure-of-speech-from-raw-audio/) base model pretrained only in **mt** on **9.1k** hours of unlabeled data from the [VoxPopuli corpus](https://arxiv.org/abs/2101.00390).
The model is pretrained on speech audio sampled at 16 kHz. When using the model, make sure that your speech input is also sampled at 16 kHz.
facebook / wav2vec2-base-nl-voxpopuli-v2
README.md
model
8 matches
tags:
transformers, pytorch, wav2vec2, pretraining, audio, automatic-speech-recognition, voxpopuli-v2, nl, dataset:voxpopuli, arxiv:2101.00390, license:cc-by-nc-4.0, region:us
# Wav2Vec2-base-VoxPopuli-V2
[Facebook's Wav2Vec2](https://ai.facebook.com/blog/wav2vec-20-learning-the-structure-of-speech-from-raw-audio/) base model pretrained only in **nl** on **19.0k** hours of unlabeled data from the [VoxPopuli corpus](https://arxiv.org/abs/2101.00390).
The model is pretrained on speech audio sampled at 16 kHz. When using the model, make sure that your speech input is also sampled at 16 kHz.
facebook / wav2vec2-base-nl-voxpopuli
README.md
model
10 matches
tags:
transformers, pytorch, wav2vec2, pretraining, audio, automatic-speech-recognition, voxpopuli, nl, arxiv:2101.00390, license:cc-by-nc-4.0, endpoints_compatible, region:us
# Wav2Vec2-Base-VoxPopuli
[Facebook's Wav2Vec2](https://ai.facebook.com/blog/wav2vec-20-learning-the-structure-of-speech-from-raw-audio/) base model pretrained on the unlabeled **nl** subset of the [VoxPopuli corpus](https://arxiv.org/abs/2101.00390).
**Paper**: *[VoxPopuli: A Large-Scale Multilingual Speech Corpus for Representation
facebook / wav2vec2-base-pl-voxpopuli-v2
README.md
model
8 matches
tags:
transformers, pytorch, wav2vec2, pretraining, audio, automatic-speech-recognition, voxpopuli-v2, pl, dataset:voxpopuli, arxiv:2101.00390, license:cc-by-nc-4.0, region:us
# Wav2Vec2-base-VoxPopuli-V2
[Facebook's Wav2Vec2](https://ai.facebook.com/blog/wav2vec-20-learning-the-structure-of-speech-from-raw-audio/) base model pretrained only in **pl** on **21.2k** hours of unlabeled data from the [VoxPopuli corpus](https://arxiv.org/abs/2101.00390).
The model is pretrained on speech audio sampled at 16 kHz. When using the model, make sure that your speech input is also sampled at 16 kHz.
facebook / wav2vec2-base-pt-voxpopuli-v2
README.md
model
8 matches
tags:
transformers, pytorch, wav2vec2, pretraining, audio, automatic-speech-recognition, voxpopuli-v2, pt, dataset:voxpopuli, arxiv:2101.00390, license:cc-by-nc-4.0, region:us
# Wav2Vec2-base-VoxPopuli-V2
[Facebook's Wav2Vec2](https://ai.facebook.com/blog/wav2vec-20-learning-the-structure-of-speech-from-raw-audio/) base model pretrained only in **pt** on **17.5k** hours of unlabeled data from the [VoxPopuli corpus](https://arxiv.org/abs/2101.00390).
The model is pretrained on speech audio sampled at 16 kHz. When using the model, make sure that your speech input is also sampled at 16 kHz.
facebook / wav2vec2-base-ro-voxpopuli-v2
README.md
model
8 matches
tags:
transformers, pytorch, wav2vec2, pretraining, audio, automatic-speech-recognition, voxpopuli-v2, ro, dataset:voxpopuli, arxiv:2101.00390, license:cc-by-nc-4.0, region:us
# Wav2Vec2-base-VoxPopuli-V2
[Facebook's Wav2Vec2](https://ai.facebook.com/blog/wav2vec-20-learning-the-structure-of-speech-from-raw-audio/) base model pretrained only in **ro** on **17.9k** hours of unlabeled data from the [VoxPopuli corpus](https://arxiv.org/abs/2101.00390).
The model is pretrained on speech audio sampled at 16 kHz. When using the model, make sure that your speech input is also sampled at 16 kHz.
facebook / wav2vec2-base-sk-voxpopuli-v2
README.md
model
8 matches
tags:
transformers, pytorch, wav2vec2, pretraining, audio, automatic-speech-recognition, voxpopuli-v2, sk, dataset:voxpopuli, arxiv:2101.00390, license:cc-by-nc-4.0, region:us
# Wav2Vec2-base-VoxPopuli-V2
[Facebook's Wav2Vec2](https://ai.facebook.com/blog/wav2vec-20-learning-the-structure-of-speech-from-raw-audio/) base model pretrained only in **sk** on **12.1k** hours of unlabeled data from the [VoxPopuli corpus](https://arxiv.org/abs/2101.00390).
The model is pretrained on speech audio sampled at 16 kHz. When using the model, make sure that your speech input is also sampled at 16 kHz.
facebook / wav2vec2-base-sl-voxpopuli-v2
README.md
model
8 matches
tags:
transformers, pytorch, wav2vec2, pretraining, audio, automatic-speech-recognition, voxpopuli-v2, sl, dataset:voxpopuli, arxiv:2101.00390, license:cc-by-nc-4.0, region:us
# Wav2Vec2-base-VoxPopuli-V2
[Facebook's Wav2Vec2](https://ai.facebook.com/blog/wav2vec-20-learning-the-structure-of-speech-from-raw-audio/) base model pretrained only in **sl** on **11.3k** hours of unlabeled data from the [VoxPopuli corpus](https://arxiv.org/abs/2101.00390).
The model is pretrained on speech audio sampled at 16 kHz. When using the model, make sure that your speech input is also sampled at 16 kHz.
facebook / wav2vec2-base-sv-voxpopuli-v2
README.md
model
8 matches
tags:
transformers, pytorch, wav2vec2, pretraining, audio, automatic-speech-recognition, voxpopuli-v2, sv, dataset:voxpopuli, arxiv:2101.00390, license:cc-by-nc-4.0, region:us
# Wav2Vec2-base-VoxPopuli-V2
[Facebook's Wav2Vec2](https://ai.facebook.com/blog/wav2vec-20-learning-the-structure-of-speech-from-raw-audio/) base model pretrained only in **sv** on **16.3k** hours of unlabeled data from the [VoxPopuli corpus](https://arxiv.org/abs/2101.00390).
The model is pretrained on speech audio sampled at 16 kHz. When using the model, make sure that your speech input is also sampled at 16 kHz.
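
The model cards listed here all ask for speech input sampled at 16 kHz. As a minimal sketch of how one might check that before passing audio to a model, the Python below reads the sampling rate from a WAV header using only the standard-library `wave` module. The helper names `wav_sample_rate`, `needs_resampling`, and `make_silent_wav` are hypothetical and introduced purely for illustration; they are not part of any model card or library. The sketch only detects a mismatch; it does not resample.

```python
import io
import wave

# Sampling rate (Hz) that the model cards ask for.
TARGET_RATE = 16_000


def wav_sample_rate(source):
    """Return the sampling rate (Hz) stored in a WAV header.

    `source` may be a filesystem path or a binary file-like object.
    """
    with wave.open(source, "rb") as wav:
        return wav.getframerate()


def needs_resampling(source, target_rate=TARGET_RATE):
    """Return True when the WAV's sampling rate differs from target_rate."""
    return wav_sample_rate(source) != target_rate


def make_silent_wav(rate, seconds=0.1):
    """Build a short mono 16-bit silent WAV in memory (illustration only)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)      # mono
        wav.setsampwidth(2)      # 16-bit samples
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(rate * seconds))
    buffer.seek(0)               # rewind so the buffer can be read back
    return buffer


if __name__ == "__main__":
    for rate in (16_000, 44_100):
        clip = make_silent_wav(rate)
        print(rate, "needs resampling:", needs_resampling(clip))
```

If `needs_resampling` returns `True`, the audio would have to be converted to 16 kHz with a dedicated resampling tool before use; that step is deliberately left out of this sketch.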