--leave-output-tensor !

#13

by ZeroWw - opened about 1 month ago

Discussion

ZeroWw

about 1 month ago

All quants should also have an alternate version quantized using --leave-output-tensor, so we can see whether that 30% bigger file actually performs better...
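To make the request concrete, here is a minimal sketch of how the two variants could be produced for one quant type with llama.cpp's quantize tool. The binary name `./llama-quantize` is an assumption (older builds call it `quantize`), and the output file names are a hypothetical naming scheme. The helper only builds the command lines and runs nothing.

```python
from pathlib import Path


def quantize_commands(src, quant_type, binary="./llama-quantize"):
    """Return (baseline, alternate) argv lists for one quant type.

    The alternate run adds --leave-output-tensor, so output.weight is not
    re-quantized. Nothing is executed here; pass a list to subprocess.run.
    """
    stem = Path(src).stem  # hypothetical naming scheme for the output files
    baseline = [binary, src, f"{stem}.{quant_type}.gguf", quant_type]
    alternate = [
        binary,
        "--leave-output-tensor",
        src,
        f"{stem}.{quant_type}.output-kept.gguf",
        quant_type,
    ]
    return baseline, alternate


def size_increase_percent(baseline_bytes, alternate_bytes):
    """How much bigger the alternate file is, as a percentage of the baseline."""
    return 100.0 * (alternate_bytes - baseline_bytes) / baseline_bytes
```

Running both commands and comparing the two files with `size_increase_percent` (plus whatever quality test you prefer) would show whether the extra size pays off for a given model.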

munish0838

Quant Factory org 29 days ago

edited 29 days ago

@ZeroWw would you like those for this model or for another specific model, and in any specific sizes? We will try to include them in future models.

ZeroWw

26 days ago

I ran some tests. So far, the model that holds up best under quantization is Mistral-7b-Instruct-v0.2.
With llama-3-8b I am getting horrible results even at q8_0.
Thanks for the offer though.
