Kotoba-Whisper: kotoba-whisper-v1.0 for Whisper.cpp

This repository contains the model weights for kotoba-tech/kotoba-whisper-v1.0 converted to GGML format. GGML is the weight format expected by C/C++ packages such as Whisper.cpp, for which we provide an example below.


Kotoba-Whisper can be run with the Whisper.cpp package using the original sequential long-form transcription algorithm.

Steps for getting started:

  1. Clone the Whisper.cpp repository:
git clone https://github.com/ggerganov/whisper.cpp.git
cd whisper.cpp
  2. Install the Hugging Face Hub Python package:
pip install --upgrade huggingface_hub

Then download the GGML weights for kotoba-whisper-v1.0 using the following Python snippet:

from huggingface_hub import hf_hub_download

hf_hub_download(repo_id='kotoba-tech/kotoba-whisper-v1.0-ggml', filename='ggml-kotoba-whisper-v1.0.bin', local_dir='./models')

If you do not have a Python environment set up, you can also download the weights directly with wget:

wget https://huggingface.co/kotoba-tech/kotoba-whisper-v1.0-ggml/resolve/main/ggml-kotoba-whisper-v1.0.bin -P ./models
  3. Run inference using the provided sample audio:
make -j && ./main -m models/ggml-kotoba-whisper-v1.0.bin -f samples/jfk.wav
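
The inference step can also be scripted from Python. The following is a minimal sketch rather than part of Whisper.cpp: `build_command` and `transcribe` are hypothetical helper names, and the default `./main` path assumes the build step above produced that binary in the repository root (the binary name may differ in other Whisper.cpp versions). Only the `-m` and `-f` flags from the command above are used.

```python
import subprocess
from pathlib import Path


def build_command(model, audio, binary="./main"):
    """Return the argument list for one Whisper.cpp transcription run."""
    # -m selects the GGML model file and -f the input audio file,
    # mirroring the shell command in the steps above.
    return [binary, "-m", str(model), "-f", str(audio)]


def transcribe(model, audio, binary="./main"):
    """Run Whisper.cpp on one audio file and return its standard output."""
    # Fail early with a clear message if the binary, the weights,
    # or the audio file is missing.
    for path in (binary, model, audio):
        if not Path(path).is_file():
            raise FileNotFoundError(f"expected file not found: {path}")
    result = subprocess.run(
        build_command(model, audio, binary),
        capture_output=True,
        encoding="utf-8",  # assumption: the transcript is printed as UTF-8
        check=True,        # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

Run from the whisper.cpp directory, `transcribe("models/ggml-kotoba-whisper-v1.0.bin", "samples/jfk.wav")` would return the text printed by the binary. Keeping the command construction in its own function makes it easy to inspect the arguments or add further flags without touching the process handling.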

Model Details

For more information about kotoba-whisper-v1.0, refer to the original model card.

