Commit fafc2f5, committed by vnoroozi
1 Parent(s): 0da8c56

Update README.md

Files changed (1)
  1. README.md +10 -4
README.md CHANGED
@@ -66,15 +66,21 @@ img {
 
  This collection contains large size versions of cache-aware FastConformer-Hybrid (around 114M parameters) with multiple look-ahead support, trained on a large scale english speech.
  These models are trained for streaming ASR which be used for streaming applications with a variety of latencies (0ms, 80ms, 480s, 1040ms).
- All models are hybrid with both Transducer and CTC decoders.
- See the [model architecture](#model-architecture) section and [NeMo documentation](https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/asr/models.html#conformer-transducer) for complete architecture details.
 
 
  ## Model Architecture
 
- FastConformer [4] is an optimized version of the Conformer model [1]. The model is trained in a multitask setup with joint Transducer and CTC decoder loss. You may find more information on the details of FastConformer here: [Fast-Conformer Model](https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/asr/models.html#fast-conformer) and about Hybrid Transducer-CTC training here: [Hybrid Transducer-CTC](https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/asr/models.html#hybrid-transducer-ctc). You may find more on how to switch between the Transducer and CTC decoders in the documentations.
+ These models are cache-aware versions of Hybrid FastConfomer which are trianed for streaming ASR. You may find more info on cache-aware models here: [Cache-aware Streaming Conformer](https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/asr/models.html#cache-aware-streaming-conformer).
+ The models are trained with multiple look-aheads which makes the model to be able to support different latencies.
+ To learn on how to switch between different look-aheads, you may read the documentations on the cache-aware models.
 
- These models are cache-aware versions of Hybrid FastConfomer which are trianed for streaming ASR. You may find more info on cache-aware models here: [Cache-aware Streaming Conformer](https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/asr/models.html#cache-aware-streaming-conformer). The models are trained with multiple look-aheads which makes the model to be able to support different latencies. To learn on how to switch between different look-aheads, you may read the documentations on the cache-aware models.
+ FastConformer [4] is an optimized version of the Conformer model [1], and
+ you may find more information on the details of FastConformer here: [Fast-Conformer Model](https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/asr/models.html#fast-conformer).
+
+ The model is trained in a multitask setup with joint Transducer and CTC decoder loss. You can find more about Hybrid Transducer-CTC training here: [Hybrid Transducer-CTC](https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/asr/models.html#hybrid-transducer-ctc).
+ You may also find more on how to switch between the Transducer and CTC decoders in the documentations.
+
+
 
 
  ## Training