GPU memory usage/requirement?

#1
by Bilibili - opened

Thanks for this work!

Since the original StarCoder requires 60+ GB of GPU RAM for inference, I wonder how much the GPTQ version needs. Could the model run inference on a V100 32GB?


I'm totally new to GPTQ and not sure how to calculate the exact requirements, but it seems happy with 20-30 GB of CPU RAM, and only about 12 GB of my GPU memory is used.

Yes, 32 GB is more than enough VRAM for nearly any GPTQ model. This one needs around 12 GB.
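Those numbers line up with a rough back-of-the-envelope estimate. StarCoder has roughly 15.5B parameters, so the 4-bit GPTQ weights alone come to about 7-8 GB, with the remainder of the ~12 GB going to quantization scales/zero-points, activations, and the KV cache. A minimal sketch of that arithmetic (the helper function is my own, just for illustration):

```python
def weight_footprint_gb(n_params: float, bits: int) -> float:
    """Size of the model weights alone, in decimal GB.

    Ignores quantization metadata (scales/zeros), activations,
    and the KV cache, so real VRAM usage will be higher.
    """
    return n_params * bits / 8 / 1e9

# StarCoder is ~15.5B parameters.
fp16_gb = weight_footprint_gb(15.5e9, bits=16)  # ~31 GB just for weights
int4_gb = weight_footprint_gb(15.5e9, bits=4)   # ~7.75 GB just for weights

print(f"fp16 weights: {fp16_gb:.2f} GB")
print(f"4-bit weights: {int4_gb:.2f} GB")
```

So a V100 32GB has plenty of headroom for the 4-bit model, while even the fp16 weights alone would be a tight fit.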
