# WhisperBot
Welcome to WhisperBot. WhisperBot builds upon the capabilities of [WhisperLive](https://github.com/collabora/WhisperLive) by integrating Mistral, a Large Language Model (LLM), on top of the real-time speech-to-text pipeline. WhisperLive relies on OpenAI Whisper, a powerful automatic speech recognition (ASR) system. Both Mistral and Whisper are optimized to run efficiently as TensorRT engines, maximizing performance and real-time processing capabilities.

## Features
- **Real-Time Speech-to-Text**: Uses WhisperLive (built on OpenAI Whisper) to convert spoken language into text in real time.

- **Large Language Model Integration**: Adds Mistral, a Large Language Model, to enhance the understanding and context of the transcribed text.

- **TensorRT Optimization**: Both Mistral and Whisper are optimized to run as TensorRT engines, ensuring high-performance and low-latency processing.

## Prerequisites
Install [TensorRT-LLM](https://github.com/NVIDIA/TensorRT-LLM/blob/main/docs/source/installation.md) to build the Whisper and Mistral TensorRT engines; its README builds a Docker image for TensorRT-LLM.
Instead of building that image, you can also follow the README together with [Dockerfile.multi](https://github.com/NVIDIA/TensorRT-LLM/blob/main/docker/Dockerfile.multi) to install the required packages in a base PyTorch Docker image. Just make sure to use the correct base image as specified in the Dockerfile.
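For reference, the containerized route might look like the sketch below. The `make` targets are assumptions taken from TensorRT-LLM's `docker/Makefile` and may differ between releases, so treat this as a starting point rather than the canonical procedure.

```shell
# Sketch only: clone TensorRT-LLM and build/run its release container.
# The release_build / release_run targets are assumptions from the upstream
# docker/Makefile; check the TensorRT-LLM installation docs if they changed.
git clone https://github.com/NVIDIA/TensorRT-LLM.git
cd TensorRT-LLM
make -C docker release_build
make -C docker release_run   # opens a shell inside the container
```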

### Whisper
- Change the working directory to the [whisper example](https://github.com/NVIDIA/TensorRT-LLM/tree/main/examples/whisper) in TensorRT-LLM.
```bash
cd TensorRT-LLM/examples/whisper
```
- By default, TensorRT-LLM currently supports only `large-v2` and `large-v3`; in this repo, we use `small.en`.
- Download the required assets.
```bash
# mel filters
wget --directory-prefix=assets https://raw.githubusercontent.com/openai/whisper/main/whisper/assets/mel_filters.npz

# small.en model
wget --directory-prefix=assets https://openaipublic.azureedge.net/main/whisper/models/f953ad0fd29cacd07d5a9eda5624af0f6bcf2258be67c92b79389873d91e0872/small.en.pt
```
- Edit `build.py` to support `small.en`. To do so, add `"small.en"` as an item in the list [`choices`](https://github.com/NVIDIA/TensorRT-LLM/blob/a75618df24e97ecf92b8899ca3c229c4b8097dda/examples/whisper/build.py#L58).
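If you'd rather script the edit, a hypothetical one-liner is shown below. The `choices=[` pattern is an assumption about how the argparse option is written in your TensorRT-LLM checkout (and it would touch every `choices=[` in the file), so verify the result with `grep` afterwards.

```shell
# Hypothetical edit, run from examples/whisper: prepend "small.en" to the
# model choices list. Assumes the literal text `choices=[` appears in
# build.py; confirm the change before building.
sed -i 's/choices=\[/choices=["small.en", /' build.py
grep -n '"small.en"' build.py
```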
- Build the `small.en` TensorRT engine.
```bash
pip install -r requirements.txt
python3 build.py --output_dir whisper_small_en --use_gpt_attention_plugin --use_gemm_plugin --use_layernorm_plugin --use_bert_attention_plugin
```
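A quick sanity check after the build: the output directory should now contain the serialized engine files (exact file names vary across TensorRT-LLM versions, so this is only a rough check).

```shell
# List the built engine artifacts; warn if the directory is missing.
ls -lh whisper_small_en/ 2>/dev/null \
  || echo "whisper_small_en/ not found - re-check the build step"
```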

### Mistral
- Change the working directory to the [llama example](https://github.com/NVIDIA/TensorRT-LLM/tree/main/examples/llama) in TensorRT-LLM.
```bash
cd TensorRT-LLM/examples/llama
```
- Convert Mistral to an `fp16` TensorRT engine.
```bash
python build.py --model_dir teknium/OpenHermes-2.5-Mistral-7B \
                --dtype float16 \
                --remove_input_padding \
                --use_gpt_attention_plugin float16 \
                --enable_context_fmha \
                --use_gemm_plugin float16 \
                --output_dir ./tmp/mistral/7B/trt_engines/fp16/1-gpu/ \
                --max_input_len 5000 \
                --max_batch_size 1
```
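As with Whisper, it's worth confirming the engine directory was populated before moving on (file names vary by TensorRT-LLM version; this is only a rough check).

```shell
# Warn early if the fp16 engine build produced nothing.
ls ./tmp/mistral/7B/trt_engines/fp16/1-gpu/ 2>/dev/null \
  || echo "engine directory missing - check the build logs"
```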

## Run WhisperBot
- Clone this repo and install requirements.
```bash
git clone https://github.com/collabora/WhisperBot.git
cd WhisperBot
apt update
apt install ffmpeg portaudio19-dev -y
pip install -r requirements.txt
```

- Take the Whisper TensorRT engine path from the build phase, along with the engine path and tokenizer path for Mistral. If Mistral was built from a Hugging Face model, just use the Hugging Face repo name as the tokenizer path.
```bash
python3 main.py --whisper_tensorrt_path /root/TensorRT-LLM/examples/whisper/whisper_small_en \
                --mistral_tensorrt_path /root/TensorRT-LLM/examples/llama/tmp/mistral/7B/trt_engines/fp16/1-gpu/ \
                --mistral_tokenizer_path teknium/OpenHermes-2.5-Mistral-7B
```
- Use the `WhisperBot/client.py` script to run on the client side.
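Since the client captures microphone audio, the client machine needs ffmpeg and PortAudio available (the apt step above installs both). Below is a rough pre-flight check; it assumes the Python PortAudio bindings used are `pyaudio`, which is an assumption to confirm against `requirements.txt`.

```shell
# Hedged pre-flight check for the client's audio dependencies.
command -v ffmpeg >/dev/null && echo "ffmpeg ok" || echo "ffmpeg missing"
python3 -c "import pyaudio" 2>/dev/null && echo "pyaudio ok" || echo "pyaudio missing"
```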

## Contact Us
For questions or issues, please open an issue.
Contact us at: marcus.edel@collabora.com, jpc@collabora.com, vineet.suryan@collabora.com