Hey :) LMstudio support and picking your brain

#2
by PsiPi - opened

Hi. I did the 13B GGUF for these and was wondering if you would be so kind as to point me at the docs, script, or whatever you used to compress the CLIP mmproj. I recall seeing something like that being available on some Linux branch of something, but for the life of me I can't dredge it up.
Would really appreciate the assist.

In other news: to make this LMstudio compatible OOTB, you might consider renaming the adaptors to mmproj-Q4_0.gguf (et al)
image.png

then it will "just work TM"
image.png
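For anyone who wants to script the rename, here is a minimal sketch. It assumes the only thing that matters is that the file name starts with `mmproj-`, as suggested above; I have not checked LMstudio's actual detection logic. The example file name and the helper names (`lmstudio_name`, `rename_projectors`) are hypothetical.

```python
from pathlib import Path

MARKER = "mmproj-"  # assumption: a name starting with this prefix is what gets picked up


def lmstudio_name(filename: str) -> str:
    """Drop everything before 'mmproj-' so the name starts with that marker."""
    idx = filename.find(MARKER)
    if idx == -1:
        raise ValueError(f"{filename!r} does not look like a projector file")
    return filename[idx:]


def rename_projectors(model_dir: Path, dry_run: bool = True) -> list[tuple[str, str]]:
    """List (old, new) names for projector files; rename them only if dry_run is False."""
    changes = []
    for path in sorted(model_dir.glob("*.gguf")):
        if MARKER not in path.name:
            continue  # not a projector file, leave it alone
        new_name = lmstudio_name(path.name)
        if new_name == path.name:
            continue  # already named the way we want
        changes.append((path.name, new_name))
        if not dry_run:
            path.rename(path.with_name(new_name))
    return changes
```

Run it with `dry_run=True` first to see what would change. Note that two projector files with the same quant suffix in one folder would collide on the new name, so keep one per folder.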

Thanks for this version.

In addition, if you were to provide the file llava.preset.json as shown here

image.png

in the repo like this

image.png

that would also preload the LMstudio template. Hope it helps.

mozilla org

Hey! I heard about your project for the first time a few days ago. It's pretty cool.

If you need to quantize the clip mmproj files, you can use the llamafile-llava-quantize-0.2.1 program from https://github.com/Mozilla-Ocho/llamafile/releases/tag/0.2.1

All the code that's needed for doing this is in the llama.cpp project upstream, but no one wrote a main() function for it until I came along. I hope you find it useful. I'm very much looking forward to using LLaVA 13B when you post it, since I honestly have no idea how to quantize the other file!

jartine changed discussion status to closed

Thanks very much. I'm also on the verge of releasing NOUS. Again, I will eventually be using the encoding quantiser when I get it working; in the interim I'll just post what I have.
