khanhld committed · Commit 8f5a52a · 1 Parent(s): 913c535

update usage

Files changed (1)
  1. README.md +46 -16
README.md CHANGED
@@ -133,39 +133,69 @@ We evaluate the models using **Word Error Rate (WER)**. To ensure consistency an
  ## Quick Usage
  To use the ChunkFormer model for Vietnamese Automatic Speech Recognition, follow these steps:

- 1. **Download the ChunkFormer Repository**
  ```bash
- git clone https://github.com/khanld/chunkformer.git
- cd chunkformer
- pip install -r requirements.txt
  ```
- 2. **Download the Model Checkpoint from Hugging Face**
  ```bash
- pip install huggingface_hub
- huggingface-cli download khanhld/chunkformer-large-vie --local-dir "./chunkformer-large-vie"
  ```
- or
- ```bash
- git lfs install
- git clone https://huggingface.co/khanhld/chunkformer-large-vie
  ```
- This will download the model checkpoint to the checkpoints folder inside your chunkformer directory.

- 3. **Run the model**
  ```bash
- python decode.py \
- --model_checkpoint path/to/local/chunkformer-large-vie \
  --long_form_audio path/to/audio.wav \
- --total_batch_duration 14400 \ #in second, default is 1800
  --chunk_size 64 \
  --left_context_size 128 \
  --right_context_size 128
  ```
  Example Output:
  ```
  [00:00:01.200] - [00:00:02.400]: this is a transcription example
  [00:00:02.500] - [00:00:03.700]: testing the long-form audio
  ```
  **Advanced Usage** can be found [HERE](https://github.com/khanld/chunkformer/tree/main?tab=readme-ov-file#usage)

  ---
 
  ## Quick Usage
  To use the ChunkFormer model for Vietnamese Automatic Speech Recognition, follow these steps:

+ ### Option 1: Install from PyPI (Recommended)
  ```bash
+ pip install chunkformer
  ```
+
+ ### Option 2: Install from source
  ```bash
+ git clone https://github.com/khanld/chunkformer.git
+ cd chunkformer
+ pip install -e .
  ```
+
+ ### Python API Usage
+ ```python
+ from chunkformer import ChunkFormerModel
+
+ # Load the Vietnamese model from Hugging Face
+ model = ChunkFormerModel.from_pretrained("khanhld/chunkformer-large-vie")
+
+ # For single long-form audio transcription
+ transcription = model.endless_decode(
+     audio_path="path/to/long_audio.wav",
+     chunk_size=64,
+     left_context_size=128,
+     right_context_size=128,
+     total_batch_duration=14400,  # in seconds
+     return_timestamps=True
+ )
+ print(transcription)
+
+ # For batch processing of multiple audio files
+ audio_files = ["audio1.wav", "audio2.wav", "audio3.wav"]
+ transcriptions = model.batch_decode(
+     audio_paths=audio_files,
+     chunk_size=64,
+     left_context_size=128,
+     right_context_size=128,
+     total_batch_duration=1800  # Total batch duration in seconds
+ )
+
+ for i, transcription in enumerate(transcriptions):
+     print(f"Audio {i+1}: {transcription}")
  ```
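
As an illustration of preparing input for `batch_decode`, the sketch below gathers the `.wav` files found directly inside a directory into a sorted list of paths. The helper name `collect_audio_paths` and the idea of keeping recordings in a single folder are assumptions made for this example; they are not part of the chunkformer package. Only the `audio_paths` argument comes from the README snippet above.

```python
from pathlib import Path


def collect_audio_paths(directory, suffix=".wav"):
    """Return a sorted list of audio file paths (as strings) found directly in `directory`.

    Hypothetical helper for illustration; it is not provided by chunkformer.
    """
    root = Path(directory)
    # Compare suffixes case-insensitively so that "A.WAV" is also picked up.
    return sorted(
        str(path)
        for path in root.iterdir()
        if path.is_file() and path.suffix.lower() == suffix.lower()
    )
```

The returned list could then be passed as `audio_paths=collect_audio_paths("recordings")` to `model.batch_decode(...)`, where `recordings` is a placeholder directory name.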
 
+ ### Command Line Usage
+ After installation, you can use the command line interface:
+
  ```bash
+ chunkformer-decode \
+ --model_checkpoint khanhld/chunkformer-large-vie \
  --long_form_audio path/to/audio.wav \
+ --total_batch_duration 14400 \
  --chunk_size 64 \
  --left_context_size 128 \
  --right_context_size 128
  ```
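
When the command has to be launched from a script, the same flags can be assembled programmatically. The following is a minimal sketch that builds the argument list for the `chunkformer-decode` invocation shown above. The function name `build_decode_command` is hypothetical, the defaults simply mirror the values in this README, and only flags that appear in the command above are used.

```python
import shlex


def build_decode_command(
    audio_path,
    model_checkpoint="khanhld/chunkformer-large-vie",
    total_batch_duration=14400,  # in seconds, value taken from the README example
    chunk_size=64,
    left_context_size=128,
    right_context_size=128,
):
    """Assemble the chunkformer-decode argument list (hypothetical helper)."""
    return [
        "chunkformer-decode",
        "--model_checkpoint", model_checkpoint,
        "--long_form_audio", str(audio_path),
        "--total_batch_duration", str(total_batch_duration),
        "--chunk_size", str(chunk_size),
        "--left_context_size", str(left_context_size),
        "--right_context_size", str(right_context_size),
    ]


command = build_decode_command("path/to/audio.wav")
# shlex.join quotes each argument where needed, so the line can be pasted into a shell.
print(shlex.join(command))
```

Once the package is installed, a list built this way could be handed to `subprocess.run(command, check=True)`; keeping the arguments as a list avoids shell-quoting problems with unusual file names.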
+
  Example Output:
  ```
  [00:00:01.200] - [00:00:02.400]: this is a transcription example
  [00:00:02.500] - [00:00:03.700]: testing the long-form audio
  ```
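
If the timestamped lines need further processing, a small parser is usually enough. The sketch below assumes every line follows the `[HH:MM:SS.mmm] - [HH:MM:SS.mmm]: text` layout of the example output above. The function name `parse_segment` is hypothetical, and the actual output of the tool may differ from this layout.

```python
import re

# One line of the example output: two bracketed timestamps, then the text.
SEGMENT_PATTERN = re.compile(
    r"^\[(\d{2}):(\d{2}):(\d{2})\.(\d{3})\] - "
    r"\[(\d{2}):(\d{2}):(\d{2})\.(\d{3})\]: (.*)$"
)


def _to_milliseconds(hours, minutes, seconds, millis):
    """Convert the four timestamp fields to an integer number of milliseconds."""
    return ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)


def parse_segment(line):
    """Return (start_ms, end_ms, text) for one output line, or None if it does not match."""
    match = SEGMENT_PATTERN.match(line.strip())
    if match is None:
        return None
    groups = match.groups()
    return _to_milliseconds(*groups[0:4]), _to_milliseconds(*groups[4:8]), groups[8]


print(parse_segment("[00:00:01.200] - [00:00:02.400]: this is a transcription example"))
# prints (1200, 2400, 'this is a transcription example')
```

Times are kept as integer milliseconds to avoid floating-point rounding when segment durations are computed later.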
+
  **Advanced Usage** can be found [HERE](https://github.com/khanld/chunkformer/tree/main?tab=readme-ov-file#usage)

  ---