What GPU recommendations for this?

#1
by hazrpg - opened

Hey, thanks so much for releasing these.

I currently have a 3060 12GB and it fails to load: it fills the available ~11 GB (presumably the remaining ~1 GB is used by the OS/display), and then PyTorch errors out with an out-of-memory error.

I might try the GGUF versions for now, but I'm curious what your recommendations are for this model: how much VRAM, etc.

It should only take about 6 GB of VRAM. In fact, you could probably load a 13B model as well. What are you using to load the model?
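For reference, here is the back-of-the-envelope math for the weights alone (a sketch; the function and the 7B figure are illustrative, not specific to this repo, and real usage adds activations, KV cache, and framework overhead on top):

```python
def weight_vram_gb(n_params_billion: float, bits_per_param: int) -> float:
    """Approximate GB needed just to hold the model weights."""
    bytes_total = n_params_billion * 1e9 * bits_per_param / 8
    return bytes_total / 1024**3

# A 7B model in fp16 needs ~13 GB for weights alone, which already
# overflows a 12 GB card; a 4-bit quant of the same model fits in ~3.3 GB.
print(f"7B fp16:  {weight_vram_gb(7, 16):.1f} GB")
print(f"7B 4-bit: {weight_vram_gb(7, 4):.1f} GB")
```

So a "6 GB" figure generally implies a quantized checkpoint rather than full fp16 weights.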

If you're using vLLM, it pre-allocates and reserves memory up front for the KV cache and prompt processing. So a 6 GB model may actually reserve more than 12 GB of VRAM.
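Concretely, vLLM's `gpu_memory_utilization` engine argument (default 0.9) caps the fraction of the card it claims, largely independent of model size. A minimal sketch of that arithmetic (the helper function below is hypothetical, for illustration; it is not vLLM code):

```python
def vllm_reserved_gb(total_vram_gb: float,
                     gpu_memory_utilization: float = 0.9) -> float:
    """Approximate VRAM vLLM claims up front: the weights are loaded and
    the rest of the budget is pre-allocated as KV-cache blocks."""
    return total_vram_gb * gpu_memory_utilization

# On a 24 GB card with the default 0.9, even a 6 GB model reserves ~21.6 GB.
print(f"{vllm_reserved_gb(24):.1f} GB reserved")
```

Lowering `gpu_memory_utilization` (e.g. `LLM(model=..., gpu_memory_utilization=0.5)`) shrinks the reservation at the cost of KV-cache capacity, i.e. fewer/shorter concurrent sequences.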
