---
license: gpl-3.0
---

# GGML 4-bit/5-bit quantized IDEA-CCNL/Ziya-LLaMA-13B-v1

* You need the latest version of llama.cpp or llama-cpp-python (to support GGML format v3).
* llama.cpp currently cannot tokenize the `<human>` and `<bot>` special tokens, so I changed them to the 🧑 and 🤖 emojis.
* Prompt like this:

```python
inputs = '🧑:' + query.strip() + '\n🤖:'
```

* If you want to quantize Ziya to GGML yourself, override its `added_tokens.json` file with the one provided in this repository.
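As a minimal end-to-end sketch, the prompt format above can be wrapped in a small helper and fed to llama-cpp-python. The model filename below is a placeholder assumption, not a file guaranteed to exist in this repository:

```python
def build_prompt(query: str) -> str:
    """Format a user query with the 🧑/🤖 markers this quantized model expects."""
    return '🧑:' + query.strip() + '\n🤖:'

prompt = build_prompt("你好")

# To actually generate, install llama-cpp-python and point it at the
# quantized GGML file (path/name below is hypothetical):
#
# from llama_cpp import Llama
# llm = Llama(model_path="ziya-llama-13b-v1-q4_0.bin")
# out = llm(prompt, max_tokens=256, stop=['🧑:'])
# print(out['choices'][0]['text'])
```

Stopping on `'🧑:'` prevents the model from hallucinating the next user turn.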