Multiple GPU support

#79
by starmpcc - opened

The ZeroGPU documentation mentions 'Allow Spaces to run on multiple GPUs.'
Does that mean I can run a large LLM such as Mixtral across multiple GPUs with ZeroGPU?

Thank you

ZeroGPU Explorers org

I'm pretty sure your question is answered here: "This is achieved by making Spaces efficiently hold and release GPUs as needed (as opposed to a classical GPU Space that holds exactly one GPU at any point in time)"
This is showcased in the gif in the documentation.
However, I'm not quite sure you can create a Mixtral demo with ZeroGPU; I could be wrong, though. You may give it a try.

Thank you for your answer!
Running Mixtral without quantization or offloading requires 3 or 4 A100s, which seems impossible with the current ZeroGPU.
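As a rough back-of-the-envelope check (assuming Mixtral-8x7B's publicly stated ~46.7B total parameters; the overhead figures are illustrative guesses, not measurements):

```python
# Rough VRAM estimate for Mixtral-8x7B in fp16/bf16, weights only.
# The ~46.7B parameter count comes from the model card; KV cache and
# activation overheads are not included and would add several more GB.
params = 46.7e9
bytes_per_param = 2  # fp16 / bf16

weights_gb = params * bytes_per_param / 1e9
print(f"weights alone: {weights_gb:.0f} GB")  # ~93 GB

# ~93 GB of weights does not fit on one A100 (40 GB or 80 GB), so
# without quantization or offloading you need roughly 3x 40 GB or
# 2x 80 GB A100s just for the weights, plus headroom on top.
```

This is why quantized GGUF builds, which shrink the weights to a fraction of that size, are a more realistic fit for a single-GPU Space.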
I think your explanation is correct.
Again, thank you!

starmpcc changed discussion status to closed
ZeroGPU Explorers org

You are welcome!
Though you could try a Mixtral GGUF from TheBloke's Mixtral-8x7B-Instruct-v0.1 quants or TheBloke's Dolphin-Mixtral-8x7B quants. I'm sure you'll be able to find a ZeroGPU Space that runs GGUF quants and play around with it.
