Could you share how to quantize to int4 with GPTQ?

#7
by loong - opened

I followed the official instructions to quantize Qwen and got this error: RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

Just call model.cuda().

https://github.com/PanQiWei/AutoGPTQ/issues/370#issuecomment-1766913012

See the comment linked above: AutoGPTQ's internal device migration is incomplete in some places.
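For anyone landing here later, this is a minimal sketch of where the model.cuda() call might go in an AutoGPTQ int4 quantization script. It follows the general shape of the AutoGPTQ quick-start, not necessarily the exact script from the Qwen docs. The model path, output directory, and calibration texts are hypothetical placeholders, and the settings (group_size=128, desc_act=False) are common defaults assumed for illustration. Whether your installed AutoGPTQ version supports Qwen is also an assumption you should check.

```python
# Sketch only: GPTQ int4 quantization with AutoGPTQ, with the model.cuda()
# workaround from this thread applied before calibration.

# Assumed settings; only bits=4 comes from the question, the rest are common defaults.
GPTQ_SETTINGS = {"bits": 4, "group_size": 128, "desc_act": False}


def quantize_to_int4(model_path, output_dir, calibration_texts):
    """Quantize the model at model_path to int4 and save it to output_dir.

    All three arguments are hypothetical placeholders supplied by the caller.
    """
    # Imported lazily so this file can be loaded without the GPU stack installed.
    from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

    # Calibration examples: one dict of token ids and attention mask per text.
    examples = []
    for text in calibration_texts:
        enc = tokenizer(text)
        examples.append(
            {"input_ids": enc["input_ids"], "attention_mask": enc["attention_mask"]}
        )

    model = AutoGPTQForCausalLM.from_pretrained(
        model_path,
        BaseQuantizeConfig(**GPTQ_SETTINGS),
        trust_remote_code=True,
    )

    # Workaround from this thread: move the whole model to the GPU up front
    # so that calibration does not mix cuda:0 and cpu tensors.
    model.cuda()

    model.quantize(examples)
    model.save_quantized(output_dir)
    tokenizer.save_pretrained(output_dir)
```

Note that model.cuda() loads the full-precision weights onto a single GPU, so this assumes the unquantized model fits in that GPU's memory.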

jklj077 changed discussion status to closed
