GPTQ Quantization Support

#2
by warlock-edward - opened

I am very interested in your model, but my GPU is a V100 and does not have enough memory to run it. It also seems that the V100 can only run GPTQ-quantized models, not AWQ ones. Would it be possible for you to provide a GPTQ-quantized version of the model?
