Titouan committed on
Commit
f627588
1 Parent(s): 7ff95b4

update readme

Files changed (1)
  1. README.md +4 -7
README.md CHANGED
@@ -24,7 +24,7 @@ SpeechBrain. For a better experience we encourage you to learn more about

  | Release | Test clean WER | Test other WER | GPUs |
  |:-------------:|:--------------:|:--------------:|:--------:|
- | 05-03-21 | 2.90 | 8.51 | 1xV100 16GB |
+ | 05-03-21 | 2.55 | 5.99 | 2xV100 32GB |

  ## Pipeline description

@@ -32,11 +32,8 @@ This ASR system is composed with 3 different but linked blocks:
  1. Tokenizer (unigram) that transforms words into subword units and trained with
  the train transcriptions of LibriSpeech.
  2. Neural language model (Transformer LM) trained on the full 10M words dataset.
- 3. Acoustic model (CRDNN + CTC/Attention). The CRDNN architecture is made of
- N blocks of convolutional neural networks with normalisation and pooling on the
- frequency domain. Then, a bidirectional LSTM with projection layers is connected
- to a final DNN to obtain the final acoustic representation that is given to
- the CTC and attention decoders.
+ 3. Acoustic model made of a transformer encoder and a joint decoder with CTC +
+ transformer. Hence, the decoding also incorporates the CTC probabilities.

  ## Intended uses & limitations

@@ -61,7 +58,7 @@ Please notice that we encourage you to read our tutorials and learn more about
  ```python
  from speechbrain.pretrained import EncoderDecoderASR

- asr_model = EncoderDecoderASR.from_hparams(source="speechbrain/asr-crdnn-transformerlm-librispeech")
+ asr_model = EncoderDecoderASR.from_hparams(source="speechbrain/asr-transformer-transformerlm-librispeech")
  asr_model.transcribe_file("path_to_your_file.wav")

  ```
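
The updated pipeline description says the decoder is a joint CTC + transformer decoder, so decoding also uses the CTC probabilities. A minimal sketch of that idea follows, assuming the two scores are combined by linear interpolation of log-probabilities. This is not SpeechBrain code: the names `joint_score` and `rerank` and the weight `ctc_weight=0.4` are hypothetical, and the sketch rescores whole hypotheses, whereas a real beam search would typically combine the scores at each decoding step and may also add the language model score.

```python
from typing import List, Tuple

# One hypothesis: (text, attention-decoder log-prob, CTC log-prob).
Hypothesis = Tuple[str, float, float]


def joint_score(att_logp: float, ctc_logp: float, ctc_weight: float = 0.4) -> float:
    """Linearly interpolate the two log-probabilities.

    ctc_weight is an illustrative value, not the one used by the model.
    """
    if not 0.0 <= ctc_weight <= 1.0:
        raise ValueError("ctc_weight must be in [0, 1]")
    return (1.0 - ctc_weight) * att_logp + ctc_weight * ctc_logp


def rerank(hyps: List[Hypothesis], ctc_weight: float = 0.4) -> List[Hypothesis]:
    """Sort hypotheses from best to worst joint score."""
    return sorted(
        hyps,
        key=lambda h: joint_score(h[1], h[2], ctc_weight),
        reverse=True,
    )


# Toy scores, made up for illustration only.
hyps = [("the cat", -1.0, -5.0), ("the cap", -1.2, -2.0)]
best_joint = rerank(hyps, ctc_weight=0.4)[0][0]
best_attention_only = rerank(hyps, ctc_weight=0.0)[0][0]
```

With these toy scores the attention decoder alone prefers the first hypothesis, while the interpolated score prefers the second, because the CTC branch penalises the first one heavily.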