Update README.md
Browse files
README.md
CHANGED
@@ -113,6 +113,55 @@ asr_model.transcribe_file(
|
|
113 |
)
|
114 |
```
|
115 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
116 |
### Inference on GPU
|
117 |
To perform inference on the GPU, add `run_opts={"device":"cuda"}` when calling the `from_hparams` method.
|
118 |
|
|
|
113 |
)
|
114 |
```
|
115 |
|
116 |
<details>
<summary>Command-line tool to transcribe a file or a live stream</summary>
118 |
+
|
119 |
+
**Decoding from a live stream using ffmpeg (BBC Radio 4):** `python3 asr.py http://as-hls-ww-live.akamaized.net/pool_904/live/ww/bbc_radio_fourfm/bbc_radio_fourfm.isml/bbc_radio_fourfm-audio%3d96000.norewind.m3u8 --model-source=sdelangen/speechbrain-asr-conformer-test --device=cpu -v`
|
120 |
+
|
121 |
+
**Decoding from a file:** `python3 asr.py some-english-speech.wav --model-source=sdelangen/speechbrain-asr-conformer-test --device=cpu -v`
|
122 |
+
|
```python
# Streaming-transcription CLI: parse options first, then lazily import the
# heavy ASR libraries so `--help` and argument errors stay fast.
from argparse import ArgumentParser
import logging

parser = ArgumentParser()
# Path or URI of the audio to transcribe (e.g. a .wav file or an HLS stream URL).
parser.add_argument("audio_path")
# Model source passed to StreamingASR.from_hparams (hub id or local path).
parser.add_argument("--model-source", required=True)
# Torch device string, e.g. "cpu" or "cuda".
parser.add_argument("--device", default="cpu")
# NOTE(review): --ip and --port are never read below — presumably left over
# from (or reserved for) a server mode; confirm before relying on them.
parser.add_argument("--ip", default="127.0.0.1")
parser.add_argument("--port", default=9431)
# Streaming decode geometry, forwarded to DynChunkTrainConfig below.
parser.add_argument("--chunk-size", default=24, type=int)
parser.add_argument("--left-context-chunks", default=4, type=int)
# Optional cap on CPU threads (forwarded to torch.set_num_threads).
parser.add_argument("--num-threads", default=None, type=int)
parser.add_argument("--verbose", "-v", default=False, action="store_true")
args = parser.parse_args()

# Root logger defaults to WARNING; -v surfaces the logging.info messages below.
if args.verbose:
    logging.getLogger().setLevel(logging.INFO)

logging.info("Loading libraries")

# Imported after argument parsing on purpose: these imports are slow.
from speechbrain.inference.ASR import StreamingASR
from speechbrain.utils.dynamic_chunk_training import DynChunkTrainConfig
import torch

device = args.device

if args.num_threads is not None:
    torch.set_num_threads(args.num_threads)

logging.info(f"Loading model from \"{args.model_source}\" onto device {device}")

asr = StreamingASR.from_hparams(args.model_source, run_opts={"device": device})
config = DynChunkTrainConfig(args.chunk_size, args.left_context_chunks)

logging.info(f"Starting stream from URI \"{args.audio_path}\"")

# Emit each decoded text chunk as soon as it is available; end="" keeps the
# transcript on one continuous line and flush=True makes it appear live.
for text_chunk in asr.transcribe_file_streaming(args.audio_path, config):
    print(text_chunk, flush=True, end="")
```
</details>
|
164 |
+
|
165 |
### Inference on GPU
|
166 |
To perform inference on the GPU, add `run_opts={"device":"cuda"}` when calling the `from_hparams` method.
|
167 |
|