Quantization Q2_K

#4
by sharonsky - opened

I want to quantize the model to Q2_K GGUF, but merges.txt is missing. Please add merges.txt or tokenizer.model.

Where can I find the GGUF file?


The GGUF file is created, but I can't add a vocabulary to it, because it has to be a SentencePiece vocabulary. To convert to SentencePiece, I need merges.txt. Without a vocabulary, the GGUF file is useless. I can upload it; maybe someone will finish it.
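
To see which tokenizer files a local copy of the model repository actually ships, a small check like the one below may help. It is only a sketch: the list of file names is an assumption about what converters commonly look for, and `tokenizer_files_present` is a hypothetical helper, not part of any library.

```python
from pathlib import Path

# Assumed list of tokenizer-related file names; adjust it for your model.
TOKENIZER_FILES = ("tokenizer.model", "tokenizer.json", "vocab.json", "merges.txt")


def tokenizer_files_present(model_dir):
    """Return a dict mapping each candidate file name to True/False."""
    root = Path(model_dir)
    return {name: (root / name).is_file() for name in TOKENIZER_FILES}


if __name__ == "__main__":
    import sys

    # Directory of the downloaded model; defaults to the current directory.
    model_dir = sys.argv[1] if len(sys.argv) > 1 else "."
    for name, found in tokenizer_files_present(model_dir).items():
        print(f"{name}: {'found' if found else 'missing'}")
```

Run it with the path to the downloaded model folder as the only argument.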

There is probably a solution in comment #5.

Almawave org

A pull request in the llama.cpp repository (https://github.com/ggerganov/llama.cpp/pull/11716) has already been submitted to address this issue and is currently under review. You can either use the fork behind the pull request now or wait for the merge, then convert the model to GGUF format and quantize it with one of the supported methods (e.g. Q2_K).
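
For reference, the usual two-step flow (convert to GGUF, then quantize) could look roughly like the sketch below. It assumes a llama.cpp checkout that already supports this model, built with CMake so that `llama-quantize` sits in `build/bin`; the directory names and the `build_commands` helper are hypothetical, so adapt them to your setup.

```python
import shlex
from pathlib import Path


def build_commands(llama_cpp_dir, model_dir, out_dir, quant="Q2_K"):
    """Build the convert and quantize command lines as argument lists."""
    llama = Path(llama_cpp_dir)
    out = Path(out_dir)
    f16_path = out / "model-f16.gguf"
    quant_path = out / f"model-{quant}.gguf"

    # Step 1: convert the Hugging Face checkpoint to an f16 GGUF file.
    convert = [
        "python", str(llama / "convert_hf_to_gguf.py"), str(model_dir),
        "--outfile", str(f16_path), "--outtype", "f16",
    ]
    # Step 2: quantize the f16 GGUF file to the requested type.
    # The binary location is an assumption (default CMake build layout).
    quantize = [
        str(llama / "build" / "bin" / "llama-quantize"),
        str(f16_path), str(quant_path), quant,
    ]
    return convert, quantize


if __name__ == "__main__":
    # Placeholder paths; replace them with your own directories.
    for cmd in build_commands("llama.cpp", "my-model", "out"):
        print(shlex.join(cmd))
```

The script only prints the commands, so you can review them before running them yourself (for example via `subprocess.run(cmd, check=True)`).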
