RuntimeError: "addmm_impl_cpu_" not implemented for 'Half'

by LaferriereJC - opened May 18, 2023

Trying to run on CPU:

import torch
from auto_gptq import AutoGPTQForCausalLM

# Attempt with an explicit cast to float32 after loading
model = AutoGPTQForCausalLM.from_quantized(
    model_repo,
    device=device,
    use_safetensors=True,
    use_triton=device != "cpu",  # comment/remove if not on Linux
).to(device).to(torch.float32)

# Original call without the cast (disabled)
"""
model = AutoGPTQForCausalLM.from_quantized(
    model_repo,
    device=device,
    use_safetensors=True,
    use_triton=device != "cpu",  # comment/remove if not on Linux
).to(device)
"""

RuntimeError: "addmm_impl_cpu_" not implemented for 'Half'

pszemraj

Analytics Club at ETH Zürich org May 18, 2023

Hi! Thanks for raising this, and I'm totally on board: AutoGPTQ does not seem to work on CPU at the moment. I have an issue open for this problem on the repo here. It would be awesome if you could also post this there so it gets more attention :)
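For context, the error message itself points at PyTorch rather than at the model: some PyTorch builds have no float16 ('Half') CPU kernel for `addmm`, the op behind a linear layer's matrix multiply. Below is a minimal sketch in plain PyTorch, independent of AutoGPTQ, that probes for this and works around it by upcasting. The helper name `matmul_cpu_safe` is hypothetical, and this is only an illustration of the dtype issue, not a confirmed fix for loading quantized models on CPU.

```python
import torch


def matmul_cpu_safe(bias, x, weight):
    """Compute bias + x @ weight, upcasting float16 inputs to float32 on CPU.

    Hypothetical helper for illustration; not part of PyTorch or AutoGPTQ.
    """
    if x.device.type == "cpu" and x.dtype == torch.float16:
        # Do the arithmetic in float32, then return in the original dtype.
        out = torch.addmm(bias.float(), x.float(), weight.float())
        return out.to(torch.float16)
    return torch.addmm(bias, x, weight)


x = torch.ones(2, 3, dtype=torch.float16)
w = torch.ones(3, 4, dtype=torch.float16)
b = torch.zeros(4, dtype=torch.float16)

try:
    # May raise on PyTorch builds that lack a Half kernel for addmm on CPU.
    torch.addmm(b, x, w)
    print("This PyTorch build supports float16 addmm on CPU")
except RuntimeError as err:
    print("float16 addmm failed on CPU:", err)

result = matmul_cpu_safe(b, x, w)
print(result.dtype, tuple(result.shape))
```

If the probe raises on your install, any module whose weights or buffers remain in float16 will hit the same error on CPU, so it is worth checking which tensors are still half precision after loading.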
