Quantization Request: Imatrix GGUF (IQ4_XS) for Qwopus3.5-122B-A10B-Kimi-K2.6-abliterated
Hi @mradermacher, could you please work your imatrix quantization magic on this amazing abliterated model?
Source Model: https://huggingface.co/OpenYourMind/Qwopus3.5-122B-A10B-Kimi-K2.6-destill-healed-abliterated
The currently available Q4_K_M is over 74 GB, which instantly OOMs on typical 64 GB VRAM setups (e.g., dual-GPU).
We really need an i1-IQ4_XS version (ideally targeting a file size of 62-63 GB). That size is the sweet spot that lets 64 GB VRAM users actually run this Kimi-distilled 122B beast locally with some room left for the context window.
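For a rough sanity check on these sizes, a back-of-the-envelope estimate is parameter count times average bits per weight, divided by 8. The sketch below is only that: it assumes about 122B total parameters (read off the model name) and approximate averages of 4.25 bpw for IQ4_XS and 4.85 bpw for Q4_K_M. Real GGUF files mix tensor types and carry metadata, so actual sizes will differ, and the function name is made up for illustration.

```python
def estimate_gguf_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Rough file size in decimal GB: params * bits-per-weight / 8 bits per byte."""
    return n_params * bits_per_weight / 8 / 1e9


# Assumed inputs, not measured values:
N_PARAMS = 122e9  # total parameter count, taken from "122B" in the model name
APPROX_BPW = {
    "IQ4_XS": 4.25,  # approximate average bits per weight
    "Q4_K_M": 4.85,  # approximate average bits per weight
}

for quant, bpw in APPROX_BPW.items():
    size_gb = estimate_gguf_size_gb(N_PARAMS, bpw)
    print(f"{quant}: ~{size_gb:.1f} GB")
```

With these assumed figures the Q4_K_M estimate lands in the neighborhood of the 74 GB quoted above. The IQ4_XS number is only a ballpark, and the KV cache for the context window needs VRAM on top of the weights.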
Thanks a lot for your continuous hard work for the local AI community!
You can check the README for info about quant sizes. Anyway...
It's queued!
You can check progress at http://hf.tst.eu/status.html or check the model summary page at https://hf.tst.eu/model#Qwopus3.5-122B-A10B-Kimi-K2.6-destill-healed-abliterated-GGUF from time to time for quants to appear.