Invalid Split File
I wonder if KoboldCpp does not support this quant and hence gives me an error, or did I combine the files incorrectly?
To check whether KoboldCpp supports the IQ3_M quant, I downloaded a similar quant from here: https://huggingface.co/bartowski/TQ2.5-14B-Sugarquill-v1-GGUF/blob/main/TQ2.5-14B-Sugarquill-v1-IQ3_M.gguf
The idea being that if this is the same quant type and it works with KoboldCpp, then Monstral-123B-IQ3_M should also work.
This smaller (unsplit) model, which is a single file to download, does indeed work with KoboldCpp, so does that mean I am combining the split files incorrectly?
I am using this command in the CMD window to combine my files:
"
COPY /B Monstral-123B-IQ3_M-00001-of-00002.gguf + Monstral-123B-IQ3_M-00002-of-00002.gguf Monstral-123B-IQ3_M.gguf
"
The resulting single file is the same size as the total of the 2 split files.
The smaller (unsplit) model [ TQ2.5-14B-Sugarquill-v1-IQ3_M.gguf ] loads without any hassle and works with KoboldCpp.
You can't combine them that way; you have to use llama.cpp's gguf-split tool.
But you also don't have to combine them: you should be able to point KoboldCpp at part 1, and it should load the rest of the parts automatically.
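Before pointing KoboldCpp at part 1, it can help to confirm that every part is actually sitting in the same folder. Here is a minimal Python sketch of that check. It assumes the `-00001-of-00002.gguf` naming seen in the filenames above, and it only looks at the first four bytes of each file, which should be `GGUF`. The function names `expected_shards` and `check_shards` are made up for illustration and are not part of KoboldCpp or llama.cpp.

```python
import re
from pathlib import Path

# Assumed naming convention, taken from the filenames in this thread:
#   <name>-00001-of-00002.gguf
SHARD_RE = re.compile(r"^(?P<base>.+)-(?P<idx>\d{5})-of-(?P<total>\d{5})\.gguf$")
GGUF_MAGIC = b"GGUF"  # the first four bytes of a GGUF file


def expected_shards(first_part: str) -> list[str]:
    """Return the filenames of all parts implied by one part's name."""
    path = Path(first_part)
    m = SHARD_RE.match(path.name)
    if m is None:
        raise ValueError(f"not a split GGUF filename: {path.name}")
    total = int(m["total"])
    return [
        str(path.with_name(f"{m['base']}-{i:05d}-of-{total:05d}.gguf"))
        for i in range(1, total + 1)
    ]


def check_shards(first_part: str) -> dict[str, str]:
    """Report per part: 'missing', 'ok' (starts with GGUF), or 'bad magic'."""
    report = {}
    for name in expected_shards(first_part):
        p = Path(name)
        if not p.is_file():
            report[name] = "missing"
            continue
        with p.open("rb") as fh:
            report[name] = "ok" if fh.read(4) == GGUF_MAGIC else "bad magic"
    return report
```

Each part begins with its own `GGUF` header, which suggests each one is a GGUF file in its own right rather than a raw slice of one big file. That would explain why `COPY /B` produces a file of the right size that still fails to load.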
Brilliant! I didn't know that. It worked!
When I used to download big models from TheBloke's Hugging Face page, I had to combine them before using them, so I thought it must be the same with these as well.
Thanks for the help and the quants!
Yes, that was an older method I also used for a bit :) But llama.cpp built their own way of splitting models a little while back, with the main benefit being that you don't need to combine them before running! That's where the real magic is :D
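If a single merged file is still wanted for some reason, the merge has to go through llama.cpp's split tool rather than a byte-level copy. Below is a rough Python sketch of calling it. It assumes the binary is named `llama-gguf-split` and is on your PATH (older llama.cpp builds shipped it as `gguf-split`), and that it is invoked as `--merge <first part> <output>`. The helper names `merge_command` and `merge_shards` are hypothetical wrappers, not part of llama.cpp.

```python
import shutil
import subprocess


def merge_command(first_part: str, output: str,
                  tool: str = "llama-gguf-split") -> list[str]:
    """Build the argument list; the tool locates the other parts from part 1."""
    return [tool, "--merge", first_part, output]


def merge_shards(first_part: str, output: str,
                 tool: str = "llama-gguf-split") -> None:
    """Run the merge, failing early if the tool is not installed."""
    if shutil.which(tool) is None:
        raise FileNotFoundError(f"{tool} not found on PATH (assumed binary name)")
    # check=True raises CalledProcessError if the tool exits with an error
    subprocess.run(merge_command(first_part, output, tool), check=True)
```

For the model in this thread the call would look like `merge_shards("Monstral-123B-IQ3_M-00001-of-00002.gguf", "Monstral-123B-IQ3_M.gguf")`, though as noted above, loading part 1 directly makes this step unnecessary.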