https://huggingface.co/jeiku/Personal_4B
Looking mainly for the imatrix Q4_0_4_8. This is probably the pinnacle for 4B and was produced by a three-step training process. Thanks in advance!
It's queued. The Q4_0_4_8 quant will be generated as part of the imatrix ones (hopefully :)
I just realized I may not have fixed the configs on this one yet, but I'd be excited to see if it works without adjusting them (it's an axolotl thing).
Let me know if you have any trouble and I'll fix it asap.
Well, it was already delayed because my scheduler was surprised by a 4B taking 82GB of disk space :)
Yeah, I didn't clean the repo at all... Sorry, I just had a hankering to chat with this one, which I made a few months back. If you give me a few minutes I can clean it up.
https://huggingface.co/jeiku/Personal_4B/tree/main
Okay, deleted all the junk and made the config changes that worked during the initial testing phase. Sorry about that. This should give you no issues.
All works fine, it seems. And it's totally fine to have checkpoints in the repo etc.; it's just that the disk space budget was suddenly negative, but I caught it in time.
Also, the config changes are not reflected in the GGUF because the download came first. Should I redo it?
If the GGUF works, then it works, but I was not able to run inference on the GGUFs created from the old configs for several sister models of this one.
It appears that axolotl somehow alters the parameters; I've verified with several colleagues that this occurs.
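If anyone wants to pin down exactly what changed, a quick way is to diff the saved config.json against the base model's. Just a rough sketch (the base model id below is a placeholder, since it isn't named in this thread):

```python
# Rough sketch: diff the trained repo's config.json against its base model's
# to see which parameters differ. "base-org/base-4b-model" is a placeholder --
# substitute the real base model id.
import json
from huggingface_hub import hf_hub_download

def load_config(repo_id: str) -> dict:
    path = hf_hub_download(repo_id, "config.json")
    with open(path) as f:
        return json.load(f)

trained = load_config("jeiku/Personal_4B")
base = load_config("base-org/base-4b-model")  # placeholder base model id

# Print every key whose value differs or exists in only one of the two configs.
for key in sorted(set(trained) | set(base)):
    if trained.get(key) != base.get(key):
        print(f"{key}: base={base.get(key)!r} -> trained={trained.get(key)!r}")
```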
Well, a common issue is that transformers has a multitude of tokenizers. Even when using only the built-in one, transformers might only use the fast tokenizer, while llama.cpp might want to read the old format, and so on.
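For what it's worth, here's a rough way to check which tokenizer formats a repo ships and whether the fast and slow tokenizers agree (just a sketch; assumes transformers and huggingface_hub are installed and uses the repo linked above):

```python
# Rough sketch: list the tokenizer-related files a repo ships and compare the
# fast and slow tokenizers, since llama.cpp's converter may want the old
# sentencepiece format while transformers is happy with tokenizer.json alone.
from huggingface_hub import list_repo_files
from transformers import AutoTokenizer

repo = "jeiku/Personal_4B"

files = list_repo_files(repo)
print("tokenizer-related files:", [f for f in files if "token" in f.lower()])

fast = AutoTokenizer.from_pretrained(repo, use_fast=True)
try:
    slow = AutoTokenizer.from_pretrained(repo, use_fast=False)
    sample = "Hello, world!"
    print("fast == slow:", fast(sample)["input_ids"] == slow(sample)["input_ids"])
except Exception as e:
    # Only the fast (tokenizer.json) format is available in the repo.
    print("slow tokenizer unavailable:", e)
```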