OC is not a multiple of cta_N = 64

#5
by lazyDataScientist - opened

I'm getting ValueError: OC is not a multiple of cta_N = 64, raised at this line in linear.py from the AutoAWQ package:

out = awq_inference_engine.gemm_forward_cuda(x.reshape(-1, x.shape[-1]), self.qweight, self.scales, self.qzeros, 8)

I get this error when loading the model with Transformers.

My setup is:
Python = 3.9.7
GPU = A100 40GB
Transformers = 4.36.2
AutoAWQ = 0.1.8

Same here, is there any solution?

Please use this model instead (TheBloke's is corrupted)
https://huggingface.co/casperhansen/mixtral-instruct-awq
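
For anyone trying to narrow down which layer triggers this: going by the error message alone, the kernel seems to require the output-channel count (OC) of each quantized layer to be a multiple of 64. Below is a minimal sketch of that check in plain Python. The function name, the dictionary layout, and the layer sizes are all hypothetical and only for illustration; they are not read from AutoAWQ or from any real checkpoint.

```python
CTA_N = 64  # tile width named in the error message


def find_incompatible_layers(out_features_by_layer, cta_n=CTA_N):
    """Return names of layers whose output-channel count is not a multiple of cta_n."""
    return [
        name
        for name, oc in out_features_by_layer.items()
        if oc % cta_n != 0
    ]


# Hypothetical layer sizes, for illustration only.
example_layers = {
    "layer_a": 4096,   # 64 * 64, passes the check
    "layer_b": 14336,  # 64 * 224, passes the check
    "layer_c": 8,      # not a multiple of 64, would fail
}

print(find_incompatible_layers(example_layers))
```

If you have the model object in hand, you could build such a dictionary from the output sizes of its quantized linear layers and see which names come back.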
