CPU Inference #13
opened by Ange09
Hello TheBloke,
Is there any way to perform inference on CPU with the model?
Thank you very much.
Technically yes, you can run GPTQ on CPU, but it's horribly slow.
If you want CPU-only inference, use the GGML versions found at https://huggingface.co/TheBloke/Llama-2-13B-chat-GGML
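As a rough sketch, CPU inference with a GGML file typically goes through llama.cpp or its Python bindings. The model filename, thread count, and prompt below are illustrative assumptions, not values from this thread; note that recent llama-cpp-python releases only load GGUF, so an older release is needed for GGML files.

```shell
# Install an older llama-cpp-python that still reads GGML files
# (newer releases require the GGUF format).
pip install 'llama-cpp-python<0.1.79'

# Download one quantized GGML file from the repo linked above
# (q4_0 is a common size/quality trade-off; filename is an assumption).
huggingface-cli download TheBloke/Llama-2-13B-chat-GGML \
    llama-2-13b-chat.ggmlv3.q4_0.bin --local-dir .

# Run a short CPU-only generation.
python - <<'EOF'
from llama_cpp import Llama

# n_threads controls how many CPU cores are used; tune to your machine.
llm = Llama(model_path="llama-2-13b-chat.ggmlv3.q4_0.bin", n_threads=8)

# Llama-2-chat expects the [INST] ... [/INST] prompt wrapper.
out = llm("[INST] Hello, who are you? [/INST]", max_tokens=64)
print(out["choices"][0]["text"])
EOF
```

This is a sketch under those assumptions, not a verified recipe; the plain `llama.cpp` `./main -m <model.bin> -p "<prompt>"` CLI works the same way if you prefer to avoid Python.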