What kind of GPU need to run this model locally on-prem ?

#17
by eliastick - opened

I'd like to run this model on-premise. What hardware and GPU do I need? Thank you.

Try GGUF quants in llama.cpp or kobold.cpp. I recommend llama.cpp, since I've hit issues with kobold.cpp related to image resizing.
I can run GGUFs on a Tesla P40, and many people report fitting 34B Q4 quants on a 7900 XTX, so 24 GB of VRAM is probably the minimum requirement.
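You can sanity-check that 24 GB figure with back-of-the-envelope math: a Q4-family GGUF stores roughly 4.5 bits per weight (the exact value varies by quant type, e.g. Q4_K_M is a bit higher), so the weights alone for a 34B model land just under 20 GB, leaving a few GB of headroom for the KV cache and runtime buffers. A minimal sketch, with the bits-per-weight figure as an assumption:

```python
def gguf_weight_size_gb(n_params_billion: float, bits_per_weight: float) -> float:
    """Approximate on-disk / in-VRAM size of quantized weights only.

    KV cache and runtime buffers add a few extra GB on top of this,
    growing with context length.
    """
    return n_params_billion * bits_per_weight / 8


# Assumed ~4.5 bits/weight for a Q4-family quant (varies by quant type).
size = gguf_weight_size_gb(34, 4.5)
print(f"{size:.1f} GB")  # ~19.1 GB of weights, fitting a 24 GB card
```

By the same formula, a Q8 quant of the same model (~8.5 bits/weight) would need about 36 GB for weights alone, which is why Q4 is the practical choice on a single 24 GB GPU.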