--leave-output-tensor !

#13
by ZeroWw - opened

All quants should also have an alternate version quantized using --leave-output-tensor,

so we can see whether that 30% bigger file actually performs better.
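For anyone wanting to produce both variants locally, here is a minimal sketch of how the two command lines could be built. It assumes llama.cpp's quantize tool is available as `llama-quantize` (older builds name it `quantize`) and that the source is an unquantized GGUF file. The helper name `build_quantize_commands`, the file name `model-f16.gguf`, and the output naming scheme are hypothetical, chosen only for illustration. The script only builds and prints the commands; it does not run them.

```python
from pathlib import Path


def build_quantize_commands(src_gguf, quant_type, binary="llama-quantize"):
    """Return two argv lists: the standard quant and the --leave-output-tensor variant.

    Assumption: `binary` is llama.cpp's quantize tool, which accepts
    [options] <input.gguf> <output.gguf> <type>.
    """
    src = Path(src_gguf)
    stem = src.stem  # e.g. "model-f16"

    # Hypothetical naming scheme so the two outputs do not overwrite each other.
    standard_out = src.with_name(f"{stem}.{quant_type}.gguf")
    alt_out = src.with_name(f"{stem}.{quant_type}.leave-output.gguf")

    standard = [binary, str(src), str(standard_out), quant_type]
    # --leave-output-tensor keeps output.weight un-requantized, giving a larger file.
    alternate = [binary, "--leave-output-tensor", str(src), str(alt_out), quant_type]
    return standard, alternate


if __name__ == "__main__":
    for cmd in build_quantize_commands("model-f16.gguf", "Q4_K_M"):
        print(" ".join(cmd))
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` once the binary is on the PATH; comparing the two output files' sizes then shows how much the untouched output tensor really costs for a given model.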

Quant Factory org
edited 29 days ago

@ZeroWw Would you like those for this model or for another specific model, and in any specific sizes? We will try to include them in future models.

I ran some tests. So far, the model that holds up best under quantization is Mistral-7b-Instruct-v0.2.
With llama-3-8b I am getting horrible results even at q8_0.
Thanks for the offer, though.
