How much GPU memory is needed?
Hi,
I am currently using the Mistral-7B-Instruct-v0.3-GGUF Q8 variant, which has a total file size of 7.7 GB. My graphics card is an Nvidia RTX™ 4000 SFF Ada Generation (20 GB GDDR6 ECC).
Now my question is: this model (Mistral-Large-Instruct-2407-GGUF) seems much too large for my graphics card, as the total file size even for the Q4 variant is 71.74 GB.
Does this mean that I would need 4 x Nvidia RTX™ 4000 graphics cards to run this model on my server?
Thanks for your help,
Ralph
Hi @rsoika
If you want to load the entire model onto GPUs, you will need more cards: the Q4 weights alone are about 72 GB, and the context (KV cache) needs some VRAM on top of that. However, you can load as many layers as fit onto your single GPU and keep the rest in system RAM, where the CPU processes them. It will be much slower, but it should work.
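To put rough numbers on this, here is a minimal back-of-the-envelope sketch in Python. It is only an estimate, not an exact calculation: the 2 GB reserved per card for KV cache and buffers is my assumption, the layer count of 88 for Mistral-Large-Instruct-2407 is what I believe the model uses (please check the model card), and the helper names `gpus_needed` and `layers_on_gpu` are made up for illustration. It also assumes all layers are about the same size, which ignores the embedding and output tensors.

```python
import math


def gpus_needed(model_gb: float, vram_per_gpu_gb: float, reserve_gb: float = 2.0) -> int:
    """Estimate how many identical GPUs are needed to hold all weights.

    reserve_gb is an assumed per-card allowance for KV cache and buffers.
    """
    usable_gb = vram_per_gpu_gb - reserve_gb
    if usable_gb <= 0:
        raise ValueError("reserve_gb must be smaller than vram_per_gpu_gb")
    return math.ceil(model_gb / usable_gb)


def layers_on_gpu(model_gb: float, n_layers: int, vram_gb: float, reserve_gb: float = 2.0) -> int:
    """Estimate how many layers fit on one GPU, assuming equally sized layers."""
    per_layer_gb = model_gb / n_layers
    usable_gb = max(vram_gb - reserve_gb, 0.0)
    return min(n_layers, int(usable_gb // per_layer_gb))


if __name__ == "__main__":
    # Figures from this thread: 71.74 GB for the Q4 file, 20 GB per card.
    print("GPUs for full offload:", gpus_needed(71.74, 20.0))
    # 88 layers is an assumption about Mistral-Large-Instruct-2407.
    print("Layers on one 20 GB card:", layers_on_gpu(71.74, 88, 20.0))
```

With these assumptions it comes out at 4 cards for a full offload (with very little headroom), or roughly 22 of the 88 layers on a single 20 GB card. If you happen to be using llama.cpp, that layer count is the kind of value you would pass as `--n-gpu-layers`; start a bit lower and increase it until you run out of VRAM.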
What are you using to load and serve Mistral-7B-Instruct-v0.3-GGUF Q8?