Titouan committed
Commit 24632ba
1 Parent(s): 115e3dd
Files changed (1):
  1. README.md +3 -3
README.md CHANGED
@@ -30,14 +30,14 @@ The performance of the model is the following:
 
 | Release | Test WER | GPUs |
 |:--------------:|:--------------:| :--------:|
-| 03-06-21 | 15.69 | 2xV100 32GB |
+| 03-06-21 | 18.91 | 2xV100 32GB |
 
 ## Pipeline description
 
 This ASR system is composed of 2 different but linked blocks:
 - Tokenizer (unigram) that transforms words into subword units and trained with
 the train transcriptions (train.tsv) of CommonVoice (RW).
-- Acoustic model (wav2vec2.0 + CTC/Attention). A pretrained wav2vec 2.0 model ([wav2vec2-lv60-large](https://huggingface.co/facebook/wav2vec2-large-lv60)) is combined with two DNN layers and finetuned on CommonVoice En.
+- Acoustic model (wav2vec2.0 + CTC/Attention). A pretrained wav2vec 2.0 model ([wav2vec2-large-xlsr-53](https://huggingface.co/facebook/wav2vec2-large-xlsr-53)) is combined with two DNN layers and finetuned on CommonVoice En.
 The obtained final acoustic representation is given to the CTC and attention decoders.
 
 
@@ -81,7 +81,7 @@ pip install -e .
 3. Run Training:
 ```bash
 cd recipes/CommonVoice/ASR/seq2seq
-python train.py hparams/train_fr_with_wav2vec.yaml --data_folder=your_data_folder
+python train.py hparams/train_rw_with_wav2vec.yaml --data_folder=your_data_folder
 ```
 
 You can find our training results (models, logs, etc) [here](https://drive.google.com/drive/folders/1tjz6IZmVRkuRE97E7h1cXFoGTer7pT73?usp=sharing).
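
The second hunk switches the recipe from the French to the Kinyarwanda hyperparameters. A minimal sketch of the full training invocation implied by the updated README, assuming the standard SpeechBrain repository layout; the `data_folder` path is a placeholder you must point at your own CommonVoice Kinyarwanda download:

```shell
# Sketch only: clone SpeechBrain and run the CommonVoice seq2seq recipe
# with the Kinyarwanda yaml named in the diff above. Requires the
# CommonVoice (rw) dataset on disk and a GPU; not runnable as-is.
git clone https://github.com/speechbrain/speechbrain
cd speechbrain
pip install -e .
cd recipes/CommonVoice/ASR/seq2seq
python train.py hparams/train_rw_with_wav2vec.yaml \
    --data_folder=/path/to/CommonVoice/rw   # placeholder path
```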