Can't get it to work on Runpod
I've been unable to get models over roughly 70B parameters to run on Runpod using webchat-UI — neither GGUF nor GPTQ — no matter what I try. I'm renting multiple GPUs, so it shouldn't be a hardware problem. My guess is that I'm missing something basic, like how to fetch models that have been split into multiple files because of their size, or how to configure a multi-GPU setup.
I just want to run inference; I'm not trying to do anything fancy here.
Suggestions?
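For what it's worth, here is a rough sketch of the pure command-line route. The repo and file names below are placeholders, and the flags are from `huggingface-cli` and llama.cpp, so adapt them to whatever backend webchat-UI wraps:

```shell
# 1) Fetch every shard of a split GGUF (shards are named like
#    *-00001-of-00003.gguf); repo/file names here are made up.
huggingface-cli download someuser/SomeModel-GGUF \
    --include "*Q5_K_M*" --local-dir ./models

# 2) Recent llama.cpp builds find the remaining shards automatically:
#    just point the loader at the FIRST shard.
./llama-server -m ./models/somemodel.Q5_K_M-00001-of-00003.gguf \
    --n-gpu-layers 999 --tensor-split 1,1   # spread layers over 2 GPUs

# 3) Or merge the shards into a single .gguf first:
./llama-gguf-split --merge \
    ./models/somemodel.Q5_K_M-00001-of-00003.gguf \
    ./models/somemodel.Q5_K_M.gguf
```

If the model still won't load with everything on one GPU, the `--tensor-split` ratios are usually the knob to play with.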
Total noob here, but the only way I could find to join split files was to use LM Studio to download the models. There are other ways, but none of them worked for me, or they looked like they'd suck me into Linux debugging hell.
I'm currently able to run this model at Q5_K_M on my local machine, so I know the download process works.
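If LM Studio isn't an option, it may help to know there are two kinds of "split" model files, and only one of them is joined by hand. Byte-split parts (suffixes like `.part1`/`.part2` or `.a`/`.b`) are just one big file cut into pieces, and plain `cat` reassembles them; modern GGUF shards named `-00001-of-0000N-` are each valid GGUF files and should NOT be cat'ed (use llama.cpp's gguf-split `--merge`, or just load the first shard). A minimal demo of the `cat` case with dummy files:

```shell
# Dummy stand-ins for byte-split parts of one large file
printf 'GGUFAAAA' > model.gguf.part1   # 8 bytes
printf 'BBBB'     > model.gguf.part2   # 4 bytes

# Plain concatenation reassembles byte-split parts, in order
cat model.gguf.part1 model.gguf.part2 > model.gguf

wc -c < model.gguf   # 12 — the sum of the parts
```

The real files are tens of gigabytes, but the command is identical; just list the parts in order.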