metadata
license: apache-2.0
language:
  - hi
pipeline_tag: automatic-speech-recognition

This is an int8-quantized, CTranslate2-compatible version of vasista22/whisper-hindi-large-v2. Quantization compresses the 5.7GB model to 1.6GB :).
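As a rough sanity check on those sizes, the sketch below estimates weight storage from a parameter count. The figure of about 1.55 billion parameters for a Whisper large model is an assumption taken from the published Whisper model sizes, not from this repository, and `estimate_size_gb` is a hypothetical helper. The estimate ignores quantization scales, unquantized layers, and the tokenizer files, so it will not match the on-disk sizes exactly.

```python
# Back-of-the-envelope size estimate (assumed parameter count, see above).
ASSUMED_PARAMS = 1_550_000_000  # assumption: approximate Whisper large size


def estimate_size_gb(n_params: int, bytes_per_param: int) -> float:
    """Return the approximate weight storage in GB (10**9 bytes)."""
    return n_params * bytes_per_param / 1e9


float32_gb = estimate_size_gb(ASSUMED_PARAMS, 4)  # 4 bytes per float32 weight
int8_gb = estimate_size_gb(ASSUMED_PARAMS, 1)     # 1 byte per int8 weight

# Ratio of the sizes reported in this README (5.7GB vs 1.6GB).
reported_ratio = 5.7 / 1.6
```

The reported ratio is a little under the ideal 4x, which is consistent with some tensors and metadata staying in higher precision.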

The model was created using:

ct2-transformers-converter --model /path/to/vasista22/whisper-hindi-large-v2 --output_dir whisper-hindi-large-v2-ct2-int8 --copy_files tokenizer_config.json preprocessor_config.json added_tokens.json special_tokens_map.json --quantization int8
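The same conversion can probably be driven from Python through CTranslate2's `TransformersConverter`. This is a minimal sketch, not the command that was actually run: `convert_to_ct2_int8` is a hypothetical wrapper, and the model path is left as the placeholder from the command above.

```python
# Files copied alongside the converted weights, mirroring --copy_files above.
COPY_FILES = [
    "tokenizer_config.json",
    "preprocessor_config.json",
    "added_tokens.json",
    "special_tokens_map.json",
]


def convert_to_ct2_int8(model_dir: str, output_dir: str) -> str:
    """Convert a Transformers Whisper checkpoint to an int8 CTranslate2 model."""
    # Imported lazily so this module loads even without ctranslate2 installed.
    from ctranslate2.converters import TransformersConverter

    converter = TransformersConverter(model_dir, copy_files=COPY_FILES)
    # quantization="int8" corresponds to the --quantization int8 flag.
    return converter.convert(output_dir, quantization="int8")


# Example call (paths are placeholders):
# convert_to_ct2_int8("/path/to/vasista22/whisper-hindi-large-v2",
#                     "whisper-hindi-large-v2-ct2-int8")
```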

For single-speaker audio, use either of:

  1. ctranslate2
  2. faster-whisper
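For the faster-whisper option, transcription might look like the sketch below. It assumes the model has been downloaded to a local directory named `whisper-hindi-large-v2-ct2-int8` (the output directory from the conversion command) and that the directory contains whatever tokenizer files faster-whisper needs. `format_segment` and `transcribe_hindi` are hypothetical helper names, and the CPU device setting is only an illustrative default.

```python
def format_segment(start: float, end: float, text: str) -> str:
    """Render one transcribed segment with its timestamps."""
    return f"[{start:.2f}s -> {end:.2f}s] {text.strip()}"


def transcribe_hindi(audio_path: str,
                     model_dir: str = "whisper-hindi-large-v2-ct2-int8") -> list:
    """Transcribe a single-speaker Hindi recording with faster-whisper."""
    # Imported lazily so this module loads even without faster-whisper installed.
    from faster_whisper import WhisperModel

    # compute_type="int8" matches the quantization of the converted weights.
    model = WhisperModel(model_dir, device="cpu", compute_type="int8")
    segments, _info = model.transcribe(audio_path, language="hi", task="transcribe")
    # segments is a lazy generator; iterating over it runs the decoding.
    return [format_segment(s.start, s.end, s.text) for s in segments]


# Example call (audio path is a placeholder):
# for line in transcribe_hindi("audio.wav"):
#     print(line)
```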

For multi-speaker audio with English diarization, use whisperX.

For multi-speaker audio with non-English diarization, use whisper-diarization.