How much GPU Memory is needed?

#14
by rsoika - opened

Hi,
I am currently using the Q8 variant of Mistral-7B-Instruct-v0.3-GGUF, which has a total file size of 7.7 GB. My graphics card is an NVIDIA RTX™ 4000 SFF Ada Generation (20 GB GDDR6 ECC).

Now my question is: this model (Mistral-Large-Instruct-2407-GGUF) seems much too large for my graphics card, as the total file size even for the Q4 variant is 71.74 GB.
Does this mean that I would need 4 x NVIDIA RTX™ 4000 graphics cards to run this model on my server?

Thanks for your help

Ralph

Hi @rsoika

If you want to load the entire model onto GPUs, you will probably need more GPU cards. However, you can load as much of the model as fits on your single GPU and keep the rest in system RAM, where it runs on the CPU. It will be much slower, but it should be possible.
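
For a rough feel for the numbers, here is a back-of-the-envelope sketch in Python. It is not a measurement. It assumes the weights are spread evenly across layers, that about 2 GB per card stays free for the KV cache and runtime buffers (`reserve_gb` is a guess that grows with context length), and that the model has 88 transformer layers (please check the model's config). The helper names are made up for illustration.

```python
import math

def gpus_needed(model_gb: float, vram_gb: float, reserve_gb: float = 2.0) -> int:
    """Identical GPUs needed to hold all weights, keeping reserve_gb free on each."""
    usable = vram_gb - reserve_gb
    if usable <= 0:
        raise ValueError("reserve_gb must be smaller than vram_gb")
    return math.ceil(model_gb / usable)

def layers_on_gpu(model_gb: float, n_layers: int, vram_gb: float,
                  reserve_gb: float = 2.0) -> int:
    """Whole layers that fit on one GPU, assuming all layers are the same size."""
    per_layer_gb = model_gb / n_layers
    usable = max(vram_gb - reserve_gb, 0.0)
    return min(n_layers, int(usable // per_layer_gb))

# Q4 file size from the question; the layer count of 88 is an assumption.
print(gpus_needed(71.74, 20.0))        # 4 with a 2 GB reserve per card
print(layers_on_gpu(71.74, 88, 20.0))  # 22 layers on a single 20 GB card
```

So four 20 GB cards would be a tight fit (with a 3 GB reserve the same arithmetic gives five), and a single card could hold roughly a quarter of the layers. If you are using llama.cpp, the number of layers placed on the GPU is set with `--n-gpu-layers`.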

What are you using to load and serve Mistral-7B-Instruct-v0.3-GGUF Q8?
