patrickvonplaten committed on
Commit
ce6f763
1 Parent(s): 0aaac34

Update README.md

  1. README.md +6 -3
README.md CHANGED
@@ -13,9 +13,11 @@ license: apache-2.0
13
 
14
  ![model image](https://raw.githubusercontent.com/patrickvonplaten/scientific_images/master/xls_r.png)
15
 
16
- [Facebook's Wav2Vec2 XLS-R](https://ai.facebook.com/blog/wav2vec-20-learning-the-structure-of-speech-from-raw-audio/)
17
 
18
- XLS-R is Facebook AI's large-scale multilingual pretrained model for speech (the "XLM-R for Speech"). It is pretrained on 436k hours of unlabeled speech, including VoxPopuli, MLS, CommonVoice, BABEL and VoxLingua107. Is uses the wav2vec 2.0 objective, in 128 languages. When using the model make sure that your speech input is sampled at 16Khz. Note that this model should be fine-tuned on a downstream task, like Automatic Speech Recognition, Translation or Classification. Check out [this blog](https://huggingface.co/blog/fine-tune-wav2vec2-english) for more information about ASR.
 
 
19
 
20
  [XLS-R Paper](https://arxiv.org/abs/)
21
 
@@ -28,9 +30,10 @@ The original model can be found under https://github.com/pytorch/fairseq/tree/ma
28
 
29
  # Usage
30
 
31
- See [this notebook](https://colab.research.google.com/github/patrickvonplaten/notebooks/blob/master/Fine_Tune_XLSR_Wav2Vec2_on_Turkish_ASR_with_%F0%9F%A4%97_Transformers.ipynb) for more information on how to fine-tune the model.
32
 
33
  You can find other pretrained XLS-R models with different numbers of parameters:
 
34
  * [300M parameters version](https://huggingface.co/facebook/wav2vec2-xls-r-300m)
35
  * [1B version version](https://huggingface.co/facebook/wav2vec2-xls-r-1b)
36
  * [2B version version](https://huggingface.co/facebook/wav2vec2-xls-r-2b)
 
13
 
14
  ![model image](https://raw.githubusercontent.com/patrickvonplaten/scientific_images/master/xls_r.png)
15
 
16
+ [Facebook's Wav2Vec2 XLS-R](https://ai.facebook.com/blog/wav2vec-20-learning-the-structure-of-speech-from-raw-audio/) counting **1 billion** parameters.
17
 
18
+ XLS-R is Facebook AI's large-scale multilingual pretrained model for speech (the "XLM-R for Speech"). It is pretrained on 436k hours of unlabeled speech, including VoxPopuli, MLS, CommonVoice, BABEL, and VoxLingua107. It uses the wav2vec 2.0 objective, in 128 languages. When using the model make sure that your speech input is sampled at 16kHz.
19
+
20
+ **Note**: This model should be fine-tuned on a downstream task, like Automatic Speech Recognition, Translation, or Classification. Check out [this blog](https://huggingface.co/blog/fine-tune-xlsr-wav2vec2) for more information about ASR.
21
 
22
  [XLS-R Paper](https://arxiv.org/abs/)
23
 
 
30
 
31
  # Usage
32
 
33
+ See [this google colab](https://colab.research.google.com/github/patrickvonplaten/notebooks/blob/master/Fine_Tune_XLS_R_on_Common_Voice.ipynb) for more information on how to fine-tune the model.
34
 
35
  You can find other pretrained XLS-R models with different numbers of parameters:
36
+
37
  * [300M parameters version](https://huggingface.co/facebook/wav2vec2-xls-r-300m)
38
  * [1B version version](https://huggingface.co/facebook/wav2vec2-xls-r-1b)
39
  * [2B version version](https://huggingface.co/facebook/wav2vec2-xls-r-2b)
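
The README text above asks for speech input sampled at 16kHz. As a rough illustration of what that requirement means for audio recorded at another rate, here is a minimal sketch in plain Python. The helper `resample_linear` and the constant `TARGET_SAMPLE_RATE` are hypothetical names introduced here, not part of the model repository or of any library. The sketch uses simple linear interpolation with no anti-aliasing filter, so it is only meant to show the idea; a dedicated audio library would normally be used for real data.

```python
# Hypothetical helper, not taken from the XLS-R repository.
TARGET_SAMPLE_RATE = 16_000  # the README asks for 16 kHz input


def resample_linear(samples, source_rate, target_rate=TARGET_SAMPLE_RATE):
    """Resample a mono waveform (sequence of floats) by linear interpolation.

    Illustrative only: no low-pass filtering is applied, so downsampling
    real recordings this way can introduce aliasing.
    """
    if source_rate <= 0 or target_rate <= 0:
        raise ValueError("sample rates must be positive")
    if source_rate == target_rate or len(samples) < 2:
        return list(samples)

    n_out = max(1, round(len(samples) * target_rate / source_rate))
    step = source_rate / target_rate  # input samples per output sample
    last = len(samples) - 1

    out = []
    for i in range(n_out):
        pos = min(i * step, last)      # fractional index into the input
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out


if __name__ == "__main__":
    # One second of a synthetic ramp "recorded" at 48 kHz.
    one_second_48k = [float(i) for i in range(48_000)]
    resampled = resample_linear(one_second_48k, source_rate=48_000)
    print(len(resampled))
```

One second of 48 kHz audio comes out as 16,000 samples, which is the length a 16 kHz model input of the same duration would have.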