Problem with Quants
Hey, noob here. It appears that the quants you made don't work. I used koboldcpp's official Colab and, well, it output this error:
error loading model: create_tensor: tensor 'output_norm.weight' not found
BlueNipples' quant of this worked fine though, correct me if I'm wrong.
Hello, I don't have any knowledge of koboldcpp, but I'll have to try it. It works fine from llamacpp.
I suppose you can check out koboldcpp's GitHub page for more info. I myself don't use llamacpp, as koboldcpp is more user friendly, and I exclusively use Colab (specifically the one that uses koboldcpp) since I don't have any capable hardware. Well, your DaringLotus GGUF quant worked, dunno what you did differently here.
Yeah, these do actually look like the wrong file sizes. Q8 should be about 11 GB, for example. Check these file sizes for a point of comparison (same model size):
https://huggingface.co/NyxKrage/FrostMaid-10.7B-TESTING-GGUF/tree/main
Your Q4_K_M is smaller than my Q3_K_M, I believe:
https://huggingface.co/BlueNipples/DaringLotus-SnowLotus-10.7b-IQ-GGUF/tree/main
File size is a function of model size and bits per weight, so that shouldn't happen.
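To make that concrete, here is a rough back-of-the-envelope sketch of the size check. The function name and the bits-per-weight figures are my own assumptions for illustration (Q8_0 stores 32 8-bit weights plus a 16-bit scale per block, so 8.5 bpw; the K-quant numbers are only approximate averages and vary a little by model), and it ignores metadata overhead:

```python
def estimate_gguf_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Rough file size in GB (10^9 bytes): parameters * bits per weight / 8.

    Hypothetical helper for illustration; ignores GGUF metadata and any
    tensors kept at a different precision.
    """
    return n_params * bits_per_weight / 8 / 1e9


# Parameter count for a 10.7B model (taken from the model name).
N_PARAMS = 10.7e9

# Approximate bits per weight per quant type (assumed values, not exact).
APPROX_BPW = {
    "Q3_K_M": 3.9,
    "Q4_K_M": 4.85,
    "Q8_0": 8.5,
}

if __name__ == "__main__":
    for quant, bpw in APPROX_BPW.items():
        size = estimate_gguf_size_gb(N_PARAMS, bpw)
        print(f"{quant}: ~{size:.1f} GB")
```

If a file on the repo is well below the estimate for its quant type, or a higher-bit quant comes out smaller than a lower-bit one of the same model, something probably went wrong during conversion.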
Yup, something did not work, because inference in model size seems to be empty. Working on it rn, I'll let you know.
Sweet. Only caught this because we were talking about quant size on discord and I looked. All good, I'm sure you'll fix it! Nice thing to have done in any case.