Apply for community grant: Academic project (gpu)

#1
by Chaerin5 - opened

We propose a large generative model for hands. It can generate hand images from given hand poses, fix malformed AI-generated hands, and more. However, we would need a GPU to host this service. More information can be found in our paper https://arxiv.org/abs/2412.02690 and on our website https://ivl.cs.brown.edu/research/foundhand.html. Thank you!

Hi @Chaerin5 , we've assigned ZeroGPU to this Space. Please check the compatibility and usage sections of this page so your Space can run on ZeroGPU.

@hysts Thank you so much for the ZeroGPU! We are trying to set up the Space with ZeroGPU, but it gives us this error:
GPU task aborted
We already tried duration=120 and called .to("cuda"), but it didn't work. Could you please help us with this? 🥺

@Chaerin5 The error is raised when the function decorated with @spaces.GPU takes longer than the specified duration. What's the expected execution time? Is it longer than 120 seconds?
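
(For context, the duration is set on the decorator itself; here's a minimal sketch, with a placeholder function name and argument:)

```python
import spaces

# The decorated call must finish within `duration` seconds;
# otherwise ZeroGPU aborts it with "GPU task aborted".
@spaces.GPU(duration=120)
def run_inference(pose):
    ...  # GPU-heavy work goes here (placeholder)
    return pose
```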

Looks like the error is raised about 10 seconds after clicking the crop button in step 1. As this Space uses private models, I can't check it on my end, but you might want to check where exactly the error is raised first.

@hysts Thanks for your comments! I fixed the "GPU task aborted" error. It looks like MediaPipe was not compatible with ZeroGPU.

But I got another problem at this line of my code:
If I do image.to("cuda"), it gives the following error:
      CUDA must not be initialized in the main process on Spaces with Stateless GPU environment
However, if I don't do image.to("cuda"), it gives me the error below.
      Input type (torch.FloatTensor) and weight type (torch.cuda.FloatTensor) should be the same or input should be a MKLDNN tensor and weight is a dense tensor

What is the best practice for moving input tensors to the CUDA device? Could you please help me with this?

@Chaerin5
Nice! Thanks for looking into it.

As for the new error, I'm not 100% sure, but I think that's because you are decorating a function defined inside another function.
Can you try moving https://huggingface.co/spaces/Chaerin5/FoundHand/blob/32ecde5ddd92139a4c9c08320874c9866dd075be/app.py#L325-L332 outside of https://huggingface.co/spaces/Chaerin5/FoundHand/blob/32ecde5ddd92139a4c9c08320874c9866dd075be/app.py#L259 ?
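
For reference, the usual pattern looks roughly like this (just a minimal sketch with a placeholder model and function name, not your actual code): the @spaces.GPU function is defined at the top level of app.py, and both the model and the input tensors are moved to CUDA inside it rather than in the main process.

```python
import spaces
import torch

# Placeholder model for illustration only; the real app would load its own weights here (on CPU).
model = torch.nn.Linear(16, 16)

@spaces.GPU  # defined at module level, not inside another function
def predict(image: torch.Tensor) -> torch.Tensor:
    model.to("cuda")          # CUDA is only touched inside the decorated call
    image = image.to("cuda")  # move input tensors here too, not in the main process
    with torch.no_grad():
        return model(image).cpu()
```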

It works now. Thank you so much for your support!

Chaerin5 changed discussion status to closed
Chaerin5 changed discussion status to open

Hi @hysts

Thank you for helping us out last December. We are currently improving our Space further for more impact!

In the meantime, I'd like to ask you about more advanced plans. From my understanding, ZeroGPU can support up to 120 seconds, but we might need to run a function for 3-5 minutes. Is there any way to work around this, or are there any plans available for longer GPU usage? We might be willing to pay to some extent, but not at the enterprise level, which I think is priced per minute, per user. Moreover, something faster than an A100 would be great.

I would highly appreciate any information you could share with us. Thank you so much!

Hi @Chaerin5 Thanks for your question!

I wanted to clarify a small misunderstanding regarding ZeroGPU. With ZeroGPU, you can actually specify a longer duration than 120 seconds. However, each user has a ZeroGPU quota, and if the specified duration exceeds their remaining quota, they won't be able to run it until their quota is refreshed. Currently, the quota for logged-in users is 5 minutes per day, so in practice, it's recommended to keep it under 300 seconds.

That said, if your function takes 3–5 minutes per run, we generally recommend creating a CPU-based Space that users can duplicate and run on their own paid hardware. The reason is that if each inference takes 5 minutes, even with a GPU assigned, you'd only get about 12 runs per hour, which leads to a very long queue with significant wait times and limited usability for most users.

Unfortunately, there are no other options besides dedicated hardware or ZeroGPU.

Ok, thank you for your information!
