This is https://huggingface.co/efwkjn/whisper-ja-1.5B converted to a CTranslate2 (CT2) model. It was tested for compatibility with https://github.com/m-bain/whisperX.
The conversion steps were as follows:
- Generate tokenizer.json
Make sure transformers and ctranslate2 are installed.
In an interactive Python shell:
# Import the transformers module
import transformers
# Import the AutoTokenizer class (otherwise you will run into NameError: name 'AutoTokenizer' is not defined)
from transformers import AutoTokenizer
# Load tokenizer from existing files
tokenizer = AutoTokenizer.from_pretrained("<source_model_dir>")
# Save the tokenizer data as tokenizer.json
tokenizer.save_pretrained("<output_dir>", legacy_format=False)
Copy the generated files (tokenizer_config.json, tokenizer.json) into <source_model_dir>.
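The copy step can also be scripted. The following is a minimal sketch using only the Python standard library; the helper name `copy_tokenizer_files` and the constant `TOKENIZER_FILES` are hypothetical and not part of transformers or ctranslate2. It assumes the two files named above were written to the output directory by `save_pretrained`.

```python
import shutil
from pathlib import Path

# File names taken from the copy step above.
TOKENIZER_FILES = ("tokenizer_config.json", "tokenizer.json")


def copy_tokenizer_files(output_dir, source_model_dir, names=TOKENIZER_FILES):
    """Copy the generated tokenizer files into the source model directory.

    Hypothetical helper: raises FileNotFoundError if any expected file is
    missing from output_dir, and returns the list of destination paths.
    """
    src = Path(output_dir)
    dst = Path(source_model_dir)

    # Fail early with a clear message instead of copying a partial set.
    missing = [name for name in names if not (src / name).is_file()]
    if missing:
        raise FileNotFoundError(f"missing in {src}: {', '.join(missing)}")

    dst.mkdir(parents=True, exist_ok=True)
    return [Path(shutil.copy2(src / name, dst / name)) for name in names]
```

A call would look like `copy_tokenizer_files("<output_dir>", "<source_model_dir>")`, with the placeholders replaced by real paths.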
- Generate CT2 Model
ct2-transformers-converter --model <model_dir> --copy_files tokenizer.json preprocessor_config.json --output_dir <output_dir> --quantization float32
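For repeatable conversions, the command above can be wrapped in a small script. This is a sketch only: `build_convert_command` and `run_conversion` are hypothetical helper names, and the sketch assumes `ct2-transformers-converter` is on the PATH (it is installed with ctranslate2). The flags are exactly those used in the command above.

```python
import shlex
import subprocess


def build_convert_command(model_dir, output_dir, quantization="float32"):
    """Return the converter command from the step above as an argument list."""
    return [
        "ct2-transformers-converter",
        "--model", str(model_dir),
        "--copy_files", "tokenizer.json", "preprocessor_config.json",
        "--output_dir", str(output_dir),
        "--quantization", quantization,
    ]


def run_conversion(model_dir, output_dir, quantization="float32"):
    """Print and run the conversion; raises CalledProcessError on failure."""
    cmd = build_convert_command(model_dir, output_dir, quantization)
    print("Running:", shlex.join(cmd))
    subprocess.run(cmd, check=True)
```

Calling `run_conversion("<model_dir>", "<output_dir>")` with real paths runs the same conversion as the shell command. Passing the arguments as a list avoids shell quoting problems when a path contains spaces.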