sdelangen committed
Commit 49c72c6 (1 parent: 368632e)

Update README.md

Files changed (1)
  1. README.md +3 -9
README.md CHANGED
@@ -77,17 +77,11 @@ With streaming, the results with different chunk sizes on test-clean are the fol
 
 ## Pipeline description
 
-TODO
-
-This ASR system is composed of 3 different but linked blocks:
-- Tokenizer (unigram) that transforms words into subword units and trained with
-the train transcriptions of LibriSpeech.
-- Neural language model (Transformer LM) trained on the full 10M words dataset.
-- Acoustic model made of a conformer encoder and a joint decoder with CTC +
-transformer. Hence, the decoding also incorporates the CTC probabilities.
+This ASR system is a Conformer model trained with the RNN-T loss (with an auxiliary CTC loss to stabilize training). The model operates with a unigram tokenizer.
+Architecture details are described in the [training hyperparameters file](https://github.com/speechbrain/speechbrain/blob/develop/recipes/LibriSpeech/ASR/transducer/hparams/conformer_transducer.yaml).
 
 The system is trained with recordings sampled at 16kHz (single channel).
-The code will automatically normalize your audio (i.e., resampling + mono channel selection) when calling *transcribe_file* if needed.
+The code will automatically normalize your audio (i.e., resampling + mono channel selection) when calling `transcribe_file` if needed.
 
 ## Install SpeechBrain
 
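
Note on the normalization line in this diff: the README says audio is normalized (resampling + mono channel selection) to match the 16 kHz single-channel training data. The sketch below illustrates what such a step could look like in plain Python. It is not SpeechBrain's implementation: the function names `to_mono`, `resample_linear`, and `normalize_audio` are hypothetical, keeping the first channel is an assumption (a library may average channels instead), and linear interpolation stands in for a proper band-limited resampler.

```python
# Illustrative sketch only; all names here are hypothetical, not SpeechBrain's API.

def to_mono(frames):
    """Keep the first channel of each frame (assumed channel-selection strategy)."""
    return [frame[0] for frame in frames]


def resample_linear(samples, src_rate, dst_rate):
    """Resample by linear interpolation.

    A production resampler would also low-pass filter to avoid aliasing
    when downsampling; that step is omitted here for brevity.
    """
    if src_rate == dst_rate or not samples:
        return list(samples)
    n_out = max(1, round(len(samples) * dst_rate / src_rate))
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate      # position in the source signal
        left = min(int(pos), last)         # nearest source sample at or before pos
        right = min(left + 1, last)        # next source sample, clamped at the end
        frac = pos - left
        out.append(samples[left] * (1 - frac) + samples[right] * frac)
    return out


def normalize_audio(frames, src_rate, target_rate=16000):
    """Convert multi-channel frames at src_rate to mono samples at target_rate."""
    return resample_linear(to_mono(frames), src_rate, target_rate)


# Stereo input at 32 kHz becomes mono at 16 kHz (every second sample is kept here).
stereo_32k = [(0.0, 9.0), (1.0, 9.0), (2.0, 9.0), (3.0, 9.0)]
mono_16k = normalize_audio(stereo_32k, src_rate=32000)
```

Audio that is already 16 kHz mono passes through unchanged, which matches the README's "if needed" wording.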