OC is not a multiple of cta_N = 64
#5, opened by lazyDataScientist
Getting ValueError: OC is not a multiple of cta_N = 64
at the line
out = awq_inference_engine.gemm_forward_cuda(x.reshape(-1, x.shape[-1]), self.qweight, self.scales, self.qzeros, 8)
in the linear.py file from the AutoAWQ package. I get this error when loading the model with Transformers.
My setup is:
python = 3.9.7
GPU = A100 40GB
Transformers = 4.36.2
AutoAWQ = 0.1.8
Same here, is there any solution?
Please use this model instead (TheBloke's is corrupted)
https://huggingface.co/casperhansen/mixtral-instruct-awq
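For anyone hitting the same error: the message suggests the GEMM kernel expects each quantized layer's output channel count (OC) to be a multiple of 64. Below is a minimal sketch of that check in plain Python. The helper names and the layer shapes in the example are hypothetical and only for illustration; they are not part of AutoAWQ or Transformers.

```python
# Tile width named in the error message ("cta_N = 64").
CTA_N = 64


def is_multiple_of_tile(out_features: int, cta_n: int = CTA_N) -> bool:
    """Return True if the output channel count is a multiple of cta_n."""
    return out_features % cta_n == 0


def find_incompatible(layer_shapes: dict[str, int], cta_n: int = CTA_N) -> list[str]:
    """Return the names of layers whose output channel count would fail the check."""
    return [
        name
        for name, out_features in layer_shapes.items()
        if not is_multiple_of_tile(out_features, cta_n)
    ]


if __name__ == "__main__":
    # Illustrative shapes only, not read from any real checkpoint.
    shapes = {"attn_proj": 4096, "small_router": 8}
    print(find_incompatible(shapes))
```

If a check like this flags a layer in a quantized checkpoint, that layer probably should not have been quantized with this kernel, which would fit with the advice above to switch to a different checkpoint.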