zero-gpu-explorers/README · Zero-GPU Quota etc

perler

ZeroGPU Explorers org Mar 1, 2024

Hi,

is there any documentation about the quota settings and similar?

Some open questions I ran into:

1. How much GPU time can I request at once? 3 minutes seems to work, 10 does not.
2. Is my quota combined for all of my Spaces? Is it based on running GPU time, independent of users?
3. How quickly does the quota recharge? It seems to be around 1 s of GPU time per 1 min of waiting.
4. Can I query my remaining quota, or do I need to catch the exception?
5. Can I request more quota for a limited time, e.g. around a presentation at a conference?

cbensimon

ZeroGPU Explorers org Mar 1, 2024

Hi @perler thanks for your interest in ZeroGPU. Official documentation will come at some point but in the meantime I'll try to accurately answer your questions

1. 5 minutes is the maximum time that can be requested at once.
2. Quotas are only applied to visitors; they do not depend on the Space. As a Space author you are subject to the same quotas as any other visitor.
3. Quotas have a half-life of 2h. This means that what you used counts for half after 2h (precision is of course down to the second).
4. For now you need to catch the exception. We might display quotas on the Hub one day (let me know if you think that would be an important feature).
5. We currently do not have a mechanism for this. If your attendees do not all connect through the same WiFi and instead use their own mobile connections, you shouldn't have quota issues: as you may have guessed, quotas are IP-based.
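
To make the half-life rule above concrete, here is a small sketch. It assumes the counted usage decays as a plain exponential with a 2h half-life; the thread does not give the exact formula, so the real accounting may differ. The function names are made up for illustration.

```python
import math

HALF_LIFE_SECONDS = 2 * 60 * 60  # 2h half-life, as stated in the answer above


def remaining_charge(used_seconds: float, elapsed_seconds: float) -> float:
    """GPU seconds still counted against the quota after `elapsed_seconds`.

    Assumption: plain exponential decay (not confirmed in the thread).
    """
    return used_seconds * 0.5 ** (elapsed_seconds / HALF_LIFE_SECONDS)


def wait_until_below(used_seconds: float, target_seconds: float) -> float:
    """Seconds to wait until the counted usage has decayed to `target_seconds`."""
    if used_seconds <= target_seconds:
        return 0.0
    return HALF_LIFE_SECONDS * math.log2(used_seconds / target_seconds)


print(remaining_charge(300, 2 * 3600))  # 150.0 (half is left after one half-life)
print(wait_until_below(300, 75))        # 14400.0 (two half-lives, i.e. 4h)
```

Under this assumption the quota never "resets" at a fixed time; it just keeps shrinking, so a short wait frees little and a long wait frees most of it.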

perler

ZeroGPU Explorers org Mar 2, 2024

edited Mar 2, 2024

@cbensimon thank you! That already helps a lot.

I have a few more questions, this time more spaces-related:

Roughly how large is the pool of GPUs for Zero-GPU? Are they all A10G?
Is there documentation of the HF Spaces backend? E.g. concerning
1. available hardware,
2. drivers,
3. settings,
4. default environment variables,
5. shared hard drive,
6. virtualization,
7. backend package versions, e.g. docker
I can test most commits locally with my own Docker setup, but Spaces has some quirks that I can only work out by trial and error. A development system for quicker building and testing would be great. One example: I tried calling my NN reconstruction directly within Python from the main process. Spaces didn't allow me to spawn the worker processes for the data loader: "daemonic processes are not allowed to have children". I guess that makes sense, but there is no way I could have tested that before pushing.

Anyway, thank you for HF. It surely helps with getting academic results to the people.

cbensimon

ZeroGPU Explorers org Mar 19, 2024

ZeroGPU recently migrated to Nvidia A100 GPUs and runs on a couple hundred of them.
No official documentation for now. Your Space runs in a containerized environment, but the specs are not stabilized at the moment.
"daemonic processes are not allowed to have children" --> This probably comes from the fact that @spaces.GPU is effect-free outside of a real ZeroGPU Space environment, so you don't get the error in your dev environment (there the function is just called, while on ZeroGPU it is run in a subprocess). We'll soon disable daemon=True on ZeroGPU workers and also provide a local ZeroGPU emulation mode (very basic, but it should still allow catching a lot more errors).

"Anyway, thank you for HF. It surely helps with getting academic results to the people."

Thank you for this 🤗
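
The "daemonic processes are not allowed to have children" error discussed above can be reproduced locally with plain `multiprocessing`, without any ZeroGPU code. The sketch below is not how ZeroGPU is implemented; it only mimics the situation described (a function running inside a worker process started with daemon=True that then tries to start its own child). It assumes a Unix system where the "fork" start method is available, and all function names are made up for illustration.

```python
import multiprocessing as mp


def _noop():
    """Stand-in for a data loader worker."""


def _try_spawn_child(conn):
    """Runs inside the worker process and tries to start a child process."""
    ctx = mp.get_context("fork")
    try:
        child = ctx.Process(target=_noop)
        child.start()  # raises AssertionError if this process is daemonic
        child.join()
        conn.send("child started")
    except AssertionError as exc:
        conn.send(str(exc))
    finally:
        conn.close()


def spawn_from_worker(daemon: bool) -> str:
    """Start a worker (daemonic or not) and report what happened inside it."""
    ctx = mp.get_context("fork")  # assumption: Unix; "fork" is unavailable on Windows
    parent_conn, child_conn = ctx.Pipe()
    worker = ctx.Process(target=_try_spawn_child, args=(child_conn,), daemon=daemon)
    worker.start()
    child_conn.close()  # keep only the worker's copy of the sending end open
    result = parent_conn.recv()
    worker.join()
    return result


if __name__ == "__main__":
    print(spawn_from_worker(daemon=True))   # daemonic processes are not allowed to have children
    print(spawn_from_worker(daemon=False))  # child started
```

Running a test like this before pushing would surface the restriction on a local machine, even though calling the decorated function directly does not.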

perler

ZeroGPU Explorers org Mar 19, 2024

thanks!

perler changed discussion status to closed Mar 19, 2024

PandaArtStation

ZeroGPU Explorers org Mar 19, 2024

very grateful for the opportunity!

I have a few questions:

...you write that the maximum time is 5 minutes, but processes running on ZeroGPU for more than a minute are terminated with an error. How can we use at least part of these nominal 5 minutes?

I see my task in experiments with generative neural network settings and a few extra minutes of calculations will greatly advance this work.

...only 10 spaces per 1 account

I understand 10 public spaces running 24/7, but who would be bothered by sleeping private spaces?
...there are a lot of neural nets in my project and I have to delete spaces to try them out

Can't you do something about the overall design? (for example)

a common panel for starting and stopping spaces, indication of workability
a loading indicator for public spaces
project design options
statistics of using your spaces
permanent storage of test images from projects together with settings (something like a notebook inside the system).

perler

ZeroGPU Explorers org Mar 23, 2024

About 1.:
You can specify the estimated duration in the GPU decorator, e.g. like this for 3 minutes:

import spaces

@spaces.GPU(duration=60 * 3)  # request up to 3 minutes per call
def run_on_gpu(*args, **kwargs):
    ...

Don't know about the other questions.
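
For anyone who wants the same file to run both on a ZeroGPU Space and on a machine where the `spaces` package is not installed, one possible pattern is sketched below. The fallback decorator `gpu` and the function bodies are hypothetical stand-ins written for this example, not part of the `spaces` package; only `spaces.GPU` and its `duration` argument come from the thread.

```python
import functools

try:
    import spaces  # present on Hugging Face Spaces
    gpu = spaces.GPU
except ImportError:
    def gpu(func=None, *, duration=None):
        """Hypothetical local stand-in: accepts @gpu and @gpu(duration=...), does nothing."""
        def decorate(f):
            @functools.wraps(f)
            def wrapper(*args, **kwargs):
                return f(*args, **kwargs)
            return wrapper
        # Bare usage (@gpu) passes the function directly; otherwise return the decorator.
        return decorate(func) if callable(func) else decorate


@gpu(duration=60 * 3)  # ask for up to 3 minutes, as in the snippet above
def run_on_gpu(x):
    return x * 2  # placeholder for the real GPU work


@gpu  # bare form, no explicit duration
def run_short(x):
    return x + 1
```

Note that a stand-in like this only keeps the code importable locally; it does not emulate the subprocess behaviour of a real ZeroGPU Space.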

cbensimon

ZeroGPU Explorers org Mar 29, 2024

edited Mar 29, 2024

1. This will be documented soon, but yes, @perler answered correctly, thanks for this!
2. Sure. For now we took very simple measures to prevent mass abuse, but it will be more fine-grained in the future (like having different limits for sleeping vs. alive ZeroGPU Spaces).
3. We're actively working on making Spaces a better platform for the community, and we're taking your ideas as feedback! (Space statistics already exist at the very bottom of the Settings tab.)
(by the way if you can elaborate on "a loading indicator for public spaces", I'd be curious @PandaArtStation )

PandaArtStation

ZeroGPU Explorers org Mar 29, 2024

(by the way if you can elaborate on "a loading indicator for public spaces", I'd be curious @PandaArtStation )

The idea is roughly as follows:

Take a model for generating pictures, experiment with settings and various additions, and put it out into the public domain after the testing phase is complete

...To understand how interested users are in this modification of the basic model, we need statistics of its use and something like a microforum specifically for this space.

cbensimon

ZeroGPU Explorers org Apr 22, 2024

"To understand how interested users are in this modification of the basic model, we need statistics of its use"
We'll soon have public statistics (total GPU runs, total GPU seconds) on ZeroGPU!

mahiatlinux

ZeroGPU Explorers org Apr 26, 2024

Has ZeroGPU moved to A10G spaces?

mrfakename

ZeroGPU Explorers org Apr 26, 2024

It’s now A100

jerpint

Jun 16, 2024

edited Jun 16, 2024

Is there a recommended way to debug without quota restrictions as the author of a ZeroGPU Space? I just paid to get access to ZeroGPU and ran out of quota very quickly because I'm still figuring out my app settings. Right now I am told to try again in about 4h30, which is a long time to wait before I can continue debugging 😅

PandaArtStation

ZeroGPU Explorers org Jul 12, 2024

Couldn't we change the reporting period from a few hours to a day, for example? Then the GPU limit would stop interfering with research.

Moibe

about 1 month ago

@cbensimon

"Quotas have a half-life of 2h. It means that what you used count half after 2h (precision is of course down to the second)"

Did this change recently? Before, the seconds regenerated every few minutes, but I have noticed that the quota now seems to reset only at the 1-hour mark, maybe at 00:00, but I'm not sure. What is the rule?