Upload Mistral-Nemo-Instruct-2407-Q4_0.gguf

#5
by venketh - opened

GPT4All will use fp32, fp16, and Q4_0 quantizations of a model; add a Q4_0 quantization (created with your fp32 file and imatrix.dat) to allow the model to be used natively.
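For reference, a Q4_0 file like that can be produced with llama.cpp's `llama-quantize` tool. This is only a sketch: the input/output filenames are hypothetical, and `--imatrix` supplies the importance matrix mentioned above.

```shell
# Sketch: produce a Q4_0 GGUF from an fp32 GGUF using an importance matrix.
# Filenames are hypothetical; llama-quantize ships with llama.cpp.
./llama-quantize --imatrix imatrix.dat \
    Mistral-Nemo-Instruct-2407-f32.gguf \
    Mistral-Nemo-Instruct-2407-Q4_0.gguf \
    Q4_0
```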

Oh, I just noticed this...

GPT4All can't use anything but Q4_0??

Q4_0, Q4_1, fp16, fp32. See gpt4all-chat/modellist.cpp.

Others likely work with manual steps, but those are the four types it can use directly.
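In essence this is a filter on the quant type advertised in the model's filename. A rough Python analogue of that check (a hypothetical helper for illustration; the real logic lives in gpt4all-chat/modellist.cpp) might look like:

```python
# Rough analogue of GPT4All's native-type filter (hypothetical helper;
# the real check lives in gpt4all-chat/modellist.cpp).
NATIVE_QUANTS = ("Q4_0", "Q4_1", "F16", "F32")

def natively_usable(filename: str) -> bool:
    """Return True if the GGUF filename advertises a natively supported quant."""
    name = filename.upper()
    return name.endswith(".GGUF") and any(q in name for q in NATIVE_QUANTS)

print(natively_usable("Mistral-Nemo-Instruct-2407-Q4_0.gguf"))  # True
print(natively_usable("Mistral-Nemo-Instruct-2407-Q6_K.gguf"))  # False
```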

That's frustrating... guess I'll start including them then 🙄

I don't know that we need them for everything, just for a handful of signature models (Mistral 7B Instruct v0.3, Nemo Instruct 2407...). WDYT about Q4_0 vs. Q4_1 here? Want me to send another PR?

(Can send pull requests for each; separately, I'll try to add Q8_0 support to GPT4All.)

Have sent a small PR to gpt4all to add Q8_0 to their supported types: https://github.com/nomic-ai/gpt4all/pull/2919

Would you like me to upload a Q4_1-quantized file for this (and Mistral 7B Instruct v0.3), from your imatrices?

I'll throw up a Q4_0, and the new Q4_0_x_x variants while I'm at it. I appreciate your efforts, but it's just easier for me to use my existing toolset :D

Also, thanks to your input, Q4_0 will be included by default so that GPT4All can always use my models (though Q8_0 is a very wise addition).

venketh changed pull request status to closed
