What is the minimum Space Hardware to run this (cloned) Space?
Since this Space is running on ZeroGPU, which allows Spaces to run on multiple GPUs, I would like to know the minimum Space Hardware required to run this (cloned) Space. My request for ZeroGPU access, which I made one week ago, has not been granted yet.
I do not mind paying to run it, but I do not want to waste time and money finding the minimum Space Hardware for this (cloned) Space by trial and error.
Hello! I think it's roughly 7B parameters for the language model, plus what I think is less than 1B for the vision tower and projector, running in float16, which you should be able to run on a V100 easily. If you want to reduce memory use during inference (not storage), you could load the model in 8-bit or 4-bit on a T4. @KHCHEUNG-UoSHK @fcakyon sorry for the late response.
Also note that 4-bit and 8-bit aren't native to NVIDIA hardware, so under the hood the weights are cast back and forth to bf16 or float32, which results in a slight increase in latency, but it makes it easier to work with a T4.
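For a rough sense of the numbers behind the reply above, here is a back-of-the-envelope sketch of weight memory at each precision. The parameter counts are the approximate figures quoted in the reply (7B for the language model, 1B taken as an upper bound for the vision tower and projector), and `weight_memory_gib` is a hypothetical helper written for this illustration, not part of any library. It counts weights only; activations, the KV cache, and framework overhead come on top, so treat the results as a lower bound when comparing against a GPU's memory.

```python
# Rough weight-memory estimate. Parameter counts are approximations
# taken from the reply above, not measured values.
BYTES_PER_PARAM = {"float16": 2.0, "8-bit": 1.0, "4-bit": 0.5}


def weight_memory_gib(n_params: float, dtype: str) -> float:
    """Return the memory needed to hold the weights alone, in GiB."""
    return n_params * BYTES_PER_PARAM[dtype] / 1024**3


LM_PARAMS = 7e9      # language model (approximate)
VISION_PARAMS = 1e9  # vision tower + projector (assumed upper bound)
TOTAL_PARAMS = LM_PARAMS + VISION_PARAMS

for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: {weight_memory_gib(TOTAL_PARAMS, dtype):.1f} GiB")
```

Comparing these figures with the memory of the card you plan to rent should give a reasonable first guess before paying for a test run.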
Thank you for your response and for providing this fantastic space!