Quantization Request: Imatrix GGUF (IQ4_XS) for Qwopus3.5-122B-A10B-Kimi-K2.6-abliterated

#2415

by mxdluffy - opened 8 days ago


mxdluffy

8 days ago

Hi @mradermacher, could you please work your imatrix quantization magic on this amazing abliterated model?

Source Model: https://huggingface.co/OpenYourMind/Qwopus3.5-122B-A10B-Kimi-K2.6-destill-healed-abliterated

The currently available Q4_K_M is over 74GB, which instantly OOMs on standard 64GB VRAM setups (e.g., dual GPU).

We really need an i1-IQ4_XS version (ideally targeting a file size of 62GB to 63GB). That size is the sweet spot that lets 64GB VRAM users actually run this Kimi-distilled 122B beast locally with some room left for the context window.
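
For anyone who wants to sanity-check the numbers, here is a rough back-of-the-envelope sketch. It assumes exactly 122e9 weights (read off the "122B" in the model name), decimal gigabytes, and 64GB of total VRAM; the helper names are made up for illustration. Real GGUF files mix tensor types and carry metadata, so actual sizes will differ.

```python
# Back-of-the-envelope sizing. Every constant below is an assumption
# taken from the numbers in this thread, not a measured value.

N_PARAMS = 122e9   # assumption: "122B" in the model name means 122e9 weights
VRAM_GB = 64.0     # assumption: decimal GB, total across all GPUs


def implied_bpw(file_size_gb: float, n_params: float = N_PARAMS) -> float:
    """Average bits per weight implied by a file size given in decimal GB."""
    return file_size_gb * 1e9 * 8 / n_params


def headroom_gb(file_size_gb: float, vram_gb: float = VRAM_GB) -> float:
    """VRAM left after loading the weights (negative means it does not fit)."""
    return vram_gb - file_size_gb


sizes = [
    ("Q4_K_M (at least 74GB)", 74.0),
    ("target, low end (62GB)", 62.0),
    ("target, high end (63GB)", 63.0),
]

for label, size_gb in sizes:
    bpw = implied_bpw(size_gb)
    room = headroom_gb(size_gb)
    print(f"{label}: ~{bpw:.2f} bits/weight, headroom {room:+.1f} GB")
```

Under these assumptions the 62GB to 63GB target works out to a little over 4 bits per weight on average, and it leaves only about 1GB to 2GB for the KV cache and compute buffers, so the usable context may still be tight.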

Thanks a lot for your continuous hard work for the local AI community!

simonko912

8 days ago

You can check the readme for info about quant sizes, anyways...
It's queued!

You can check for progress at http://hf.tst.eu/status.html or regularly check the model
summary page at https://hf.tst.eu/model#Qwopus3.5-122B-A10B-Kimi-K2.6-destill-healed-abliterated-GGUF for quants to appear.
