Titouan committed on
Commit
381bad0
2 Parent(s): 0a69b53 6bbde74

Merge branch 'main' of https://huggingface.co/speechbrain/asr-transformer-transformerlm-librispeech into main

Files changed (2)
  1. README.md +7 -7
  2. hyperparams.yaml +5 -2
README.md CHANGED
@@ -5,7 +5,7 @@ tags:
 - ASR
 - CTC
 - Attention
-- Tranformer
+- Transformer
 - pytorch
 license: "apache-2.0"
 datasets:
@@ -19,7 +19,7 @@ metrics:
 
 This repository provides all the necessary tools to perform automatic speech
 recognition from an end-to-end system pretrained on LibriSpeech (EN) within
-SpeechBrain. For a better experience we encourage you to learn more about
+SpeechBrain. For a better experience, we encourage you to learn more about
 [SpeechBrain](https://speechbrain.github.io). The given ASR model performance are:
 
 | Release | Test clean WER | Test other WER | GPUs |
@@ -28,18 +28,18 @@ SpeechBrain. For a better experience we encourage you to learn more about
 
 ## Pipeline description
 
-This ASR system is composed with 3 different but linked blocks:
+This ASR system is composed of 3 different but linked blocks:
 1. Tokenizer (unigram) that transforms words into subword units and trained with
 the train transcriptions of LibriSpeech.
 2. Neural language model (Transformer LM) trained on the full 10M words dataset.
 3. Acoustic model made of a transformer encoder and a joint decoder with CTC +
-transformer. Hence, the decoding also incorporate the CTC probabilities.
+transformer. Hence, the decoding also incorporates the CTC probabilities.
 
 ## Intended uses & limitations
 
-This model has been primilarly developed to be run within SpeechBrain as a pretrained ASR model
-for the english language. Thanks to the flexibility of SpeechBrain, any of the 3 blocks
-detailed above can be extracted and connected to you custom pipeline as long as SpeechBrain is
+This model has been primarily developed to be run within SpeechBrain as a pretrained ASR model
+for the English language. Thanks to the flexibility of SpeechBrain, any of the 3 blocks
+detailed above can be extracted and connected to your custom pipeline as long as SpeechBrain is
 installed.
 
 ## Install SpeechBrain
hyperparams.yaml CHANGED
@@ -118,14 +118,17 @@ lm_model: !new:speechbrain.lobes.models.transformer.TransformerLM.TransformerLM
 
 tokenizer: !new:sentencepiece.SentencePieceProcessor
 
+asr_encoder: !speechbrain.utils.callchains.LengthsCapableChain
+    - !ref <CNN>
+    - !ref <Transformer.encode>
+
 # Models
 asr_model: !new:torch.nn.ModuleList
     - [!ref <CNN>, !ref <Transformer>, !ref <seq_lin>, !ref <ctc_lin>]
 
 modules:
     compute_features: !ref <compute_features>
-    pre_transformer: !ref <CNN>
-    transformer: !ref <Transformer>
+    asr_encoder: !ref <asr_encoder>
     asr_model: !ref <asr_model>
     normalize: !ref <normalize>
     lm_model: !ref <lm_model>
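
The hyperparams.yaml change above replaces the separate pre_transformer and transformer module entries with a single asr_encoder chain built from the CNN and Transformer.encode. As a rough illustration of that idea only, the sketch below shows a callable that runs its stages in order and forwards the lengths argument only to stages that declare it. This is not SpeechBrain's implementation: `LengthsChain`, `cnn`, and `encode` are hypothetical stand-ins operating on plain Python lists, and the use of relative lengths (a fraction of the full sequence) is an assumption made for the toy example.

```python
import inspect


class LengthsChain:
    """Hypothetical stand-in: call stages in order, passing `lengths` only where accepted."""

    def __init__(self, *stages):
        self.stages = list(stages)
        # Record, per stage, whether its signature declares a `lengths` parameter.
        self.takes_lengths = [
            "lengths" in inspect.signature(stage).parameters for stage in stages
        ]

    def __call__(self, x, lengths=None):
        for stage, takes in zip(self.stages, self.takes_lengths):
            x = stage(x, lengths=lengths) if takes else stage(x)
            # If a stage returns a tuple, keep only its first element as the features.
            if isinstance(x, tuple):
                x = x[0]
        return x


def cnn(x):
    # Toy front end (assumption): doubles every value and ignores lengths.
    return [2 * v for v in x]


def encode(x, lengths=None):
    # Toy encoder (assumption): zeroes every position past the valid relative length.
    keep = len(x) if lengths is None else round(lengths * len(x))
    return [v if i < keep else 0 for i, v in enumerate(x)]


# Mirrors the shape of the YAML entry: first the CNN, then the encoder.
asr_encoder = LengthsChain(cnn, encode)
print(asr_encoder([1, 2, 3, 4], lengths=0.5))  # prints [2, 4, 0, 0]
```

With a wrapper of this kind, calling code can treat the encoder as one unit, for example `asr_encoder(features, lengths=...)`, instead of invoking each stage separately.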