Possible GGUF
Would you please convert this into GGUF format? Thank you.
And could you please make GGUF quants of your existing Platypus and Deacon models?
To my knowledge, ggml already supports the original Qwen architecture, so there are already GGUFs:
https://huggingface.co/zly/Qwen-1_8B-Chat-Int4-GGUF
https://huggingface.co/Qwen/Qwen-1_8B-Chat
It's easy to convert the original model to GGUF yourself, too; a rough sketch of the process is below.
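A minimal sketch of that conversion, assuming a local llama.cpp checkout (the converter script name, flags, and paths vary between llama.cpp versions, so treat them as placeholders):

```python
# Hypothetical sketch: convert the original HF checkpoint to GGUF
# using llama.cpp's converter script. Script name and paths are
# assumptions; adjust them to your llama.cpp version and layout.
import subprocess
from pathlib import Path

LLAMA_CPP = Path("~/llama.cpp").expanduser()  # assumed checkout location
MODEL_DIR = Path("./Qwen-1_8B-Chat")          # local HF model snapshot

# Produce an f16 GGUF file from the Hugging Face checkpoint.
subprocess.run(
    [
        "python",
        str(LLAMA_CPP / "convert-hf-to-gguf.py"),
        str(MODEL_DIR),
        "--outtype", "f16",
        "--outfile", "qwen-1_8b-chat-f16.gguf",
    ],
    check=True,
)
```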
This llamafication is mostly for easier fine-tuning and broader Llama-ecosystem compatibility.
Nonetheless, I wanted to GGUF these, too. I consider 6-bit quantization pretty much lossless, and there are indications that 4-bit works with very little quality loss for this particular model.
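For anyone quantizing locally, the step looks roughly like this, assuming llama.cpp's `quantize` tool has been built (the binary name and location are assumptions; newer llama.cpp versions call it `llama-quantize`):

```python
# Hypothetical sketch: produce Q6_K and Q4_K_M quants from the f16 GGUF
# with llama.cpp's quantize tool. Binary name/path are assumptions.
import subprocess
from pathlib import Path

LLAMA_CPP = Path("~/llama.cpp").expanduser()  # assumed checkout location
SRC = "qwen-1_8b-chat-f16.gguf"               # f16 GGUF from the previous step

# Q6_K is close to lossless; Q4_K_M trades a little quality for size.
for quant in ("Q6_K", "Q4_K_M"):
    subprocess.run(
        [str(LLAMA_CPP / "quantize"), SRC, f"qwen-1_8b-{quant}.gguf", quant],
        check=True,
    )
```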
I'm uploading the 6-bit quants now; give it a minute:
https://huggingface.co/KnutJaegersberg/Qwen-1_8B-gguf
Thank you!