digikar committed on
Commit 8f0745b
1 Parent(s): 2b176d4

Update README.md

  1. README.md +24 -3
README.md CHANGED
@@ -1,3 +1,24 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ language:
+ - hi
+ pipeline_tag: automatic-speech-recognition
+ ---
+
+ Int8-quantized [CTranslate2](https://github.com/OpenNMT/CTranslate2)-compatible version of [vasista22/whisper-hindi-large-v2](https://huggingface.co/vasista22/whisper-hindi-large-v2).
+ Quantization compresses the 5.7GB model to 1.6GB, roughly 3.6x smaller :).
+
+ The model was created using:
+
+ ```
+ ct2-transformers-converter --model /path/to/vasista22/whisper-hindi-large-v2 --output_dir whisper-hindi-large-v2-ct2-int8 --copy_files tokenizer_config.json preprocessor_config.json added_tokens.json special_tokens_map.json --quantization int8
+ ```
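
The same conversion can probably be driven from Python through CTranslate2's converter API instead of the CLI. The sketch below is not part of this commit: `convert_to_int8` and `directory_size_gb` are hypothetical helper names, and it assumes `ctranslate2` and `transformers` are installed when the conversion itself is run.

```python
from pathlib import Path

# Files copied next to the converted weights, matching --copy_files above.
COPY_FILES = [
    "tokenizer_config.json",
    "preprocessor_config.json",
    "added_tokens.json",
    "special_tokens_map.json",
]


def convert_to_int8(model_path: str, output_dir: str) -> str:
    """Convert a Transformers Whisper checkpoint to int8 CTranslate2 format.

    Hypothetical helper; mirrors the ct2-transformers-converter command.
    """
    # Imported lazily so this module loads even without ctranslate2 installed.
    from ctranslate2.converters import TransformersConverter

    converter = TransformersConverter(model_path, copy_files=COPY_FILES)
    # quantization="int8" corresponds to the --quantization int8 flag.
    return converter.convert(output_dir, quantization="int8")


def directory_size_gb(path: str) -> float:
    """Sum the sizes of all files under `path`, in decimal gigabytes."""
    total = sum(f.stat().st_size for f in Path(path).rglob("*") if f.is_file())
    return total / 1e9
```

Calling `directory_size_gb` on the source checkpoint and on the output directory is one way to check the size reduction on your own machine.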
+
+ For single-speaker audio, use either of:
+
+ 1. [ctranslate2](https://github.com/OpenNMT/CTranslate2)
+ 2. [faster-whisper](https://github.com/SYSTRAN/faster-whisper)
+
+ For multi-speaker audio with English diarization, use [whisperX](https://github.com/m-bain/whisperX/).
+
+ For multi-speaker audio with non-English diarization, use [whisper-diarization](https://github.com/MahmoudAshraf97/whisper-diarization/).
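
For the single-speaker case, a minimal faster-whisper sketch might look like the following. It is not part of this commit: `transcribe_hindi` and `format_segment` are hypothetical names, the model directory and audio path are placeholders, and it assumes the converted directory loads in faster-whisper as-is.

```python
def format_segment(start: float, end: float, text: str) -> str:
    """Render one transcript segment as '[start -> end] text'."""
    return f"[{start:.2f}s -> {end:.2f}s] {text.strip()}"


def transcribe_hindi(model_dir: str, audio_path: str) -> list[str]:
    """Transcribe a Hindi audio file with the int8 CTranslate2 model.

    Hypothetical helper; model_dir is the converter's output directory.
    """
    # Imported lazily so format_segment is usable without faster-whisper.
    from faster_whisper import WhisperModel

    # compute_type="int8" keeps inference in the quantized precision.
    model = WhisperModel(model_dir, device="cpu", compute_type="int8")
    # transcribe() returns a lazy generator of segments plus run metadata.
    segments, _info = model.transcribe(audio_path, language="hi")
    return [format_segment(s.start, s.end, s.text) for s in segments]


if __name__ == "__main__":
    # Placeholder paths; replace with your own model directory and audio.
    for line in transcribe_hindi("whisper-hindi-large-v2-ct2-int8", "audio.wav"):
        print(line)
```

Segments are produced lazily, so the actual decoding happens while the list is being built rather than when `transcribe()` returns.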