Maybe these need a requant and IQ3_S models?

#1
by Cran-May - opened

As the title says.

IQ3_S is just the right fit for 4GB VRAM devices running 8B models. (IQ3_M is best for 7B models.)
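
For a rough sense of why that fits, here is a back-of-the-envelope sketch of the weight size. It assumes the nominal bits-per-weight figures of about 3.44 for IQ3_S and 3.66 for IQ3_M, and round parameter counts of 8e9 and 7e9; `NOMINAL_BPW` and `estimate_weight_gb` are just illustrative names. Real GGUF files come out somewhat different because some tensors are kept at other precisions, and the KV cache and context need VRAM on top of the weights.

```python
# Rough weight-size estimate: parameters * bits-per-weight / 8 bytes.
# The bpw values are nominal figures (assumption); actual GGUF files differ
# because some tensors are stored at other precisions.
NOMINAL_BPW = {"IQ3_S": 3.44, "IQ3_M": 3.66}


def estimate_weight_gb(n_params: float, bpw: float) -> float:
    """Return the approximate weight size in GB (10^9 bytes)."""
    return n_params * bpw / 8 / 1e9


# Round parameter counts, used here only for illustration.
for name, n_params, quant in [("8B", 8e9, "IQ3_S"), ("7B", 7e9, "IQ3_M")]:
    size = estimate_weight_gb(n_params, NOMINAL_BPW[quant])
    print(f"{name} at {quant}: ~{size:.2f} GB of weights")
```

Under those assumptions both combinations land under 4 GB for the weights alone, which leaves only limited headroom for context.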

I'd need to try redoing these quants with the latest llamacpp, and if I do, I'll include IQ3_S.

These will be reuploaded with the new llamacpp version.

Lewdiculous changed discussion status to closed
