Add IQ Quantization support with the help of imatrix and GPUs
It would allow us to create the imatrix data and the quants in one go!
Super useful when dealing with 100B+ models; 1-bit (IQ1_M) support would be really nice.
Would be awesome to see options for IQ 6 / 5 / 4 / 3 / 2 and the NL / XS variants.
Would be really awesome to have an option to upload a .txt calibration file for imatrix creation and then generate imatrix quants from it.
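For reference, the underlying llama.cpp workflow the Space would need to run is roughly the following sketch (binary names and flags as in recent llama.cpp builds; the model and file names are placeholders, not anything from this Space):

```shell
# 1. Compute the importance matrix from an uploaded calibration text file
#    (-m: full-precision model, -f: calibration text, -o: output matrix).
./llama-imatrix -m model-f16.gguf -f calibration.txt -o imatrix.dat

# 2. Quantize using that matrix, e.g. to IQ4_XS.
./llama-quantize --imatrix imatrix.dat model-f16.gguf model-iq4_xs.gguf IQ4_XS
```

Running both steps back to back is what would make the "one go" experience possible.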
We just merged support for iMatrix! Do let us know if you have any feedback! 🤗
@reach-vb Just gave it a try. I have one suggestion: currently it is impossible to see the progress because the Gradio UI only shows a loading indicator. I think it would be better if the console logs were shown instead. That would let us track progress and inspect any errors encountered during calculation/conversion. :)
Thanks to you and everybody else involved. I should close this discussion now. :)
That's brilliant feedback!
@reach-vb
Hi! When does llama.cpp get updated here?
Sorry for bothering you, but currently Gemma (9B) conversion fails because of an assert, which has been fixed upstream.
We need at least b3389 for the fix.
I was not sure how to contact you, so commented here. 😃