Hub documentation
Spaces ZeroGPU: Dynamic GPU Allocation for Spaces
Spaces ZeroGPU: Dynamic GPU Allocation for Spaces

ZeroGPU is a shared infrastructure that optimizes GPU usage for AI models and demos on Hugging Face Spaces. It dynamically allocates and releases NVIDIA H200 GPUs as needed, offering:
- Free GPU Access: Enables cost-effective GPU usage for Spaces.
- Multi-GPU Support: Allows Spaces to leverage multiple GPUs concurrently on a single application.
Unlike traditional single-GPU allocations, ZeroGPU’s efficient system lowers barriers for developers, researchers, and organizations to deploy AI models by maximizing resource utilization and power efficiency.
Using and hosting ZeroGPU Spaces
- Using existing ZeroGPU Spaces
- ZeroGPU Spaces are available to use for free to all users. (Visit the curated list).
- PRO users get x5 more daily usage quota and highest priority in GPU queues when using any ZeroGPU Spaces.
- Hosting your own ZeroGPU Spaces
- Personal accounts: Subscribe to PRO to access ZeroGPU in the hardware options when creating a new Gradio SDK Space.
- Organizations: Subscribe to the Enterprise Hub to enable ZeroGPU Spaces for all organization members.
Technical Specifications
- GPU Type: Nvidia H200 slice
- Available VRAM: 70GB per workload
Compatibility
ZeroGPU Spaces are designed to be compatible with most PyTorch-based GPU Spaces. While compatibility is enhanced for high-level Hugging Face libraries like transformers
and diffusers
, users should be aware that:
- Currently, ZeroGPU Spaces are exclusively compatible with the Gradio SDK.
- ZeroGPU Spaces may have limited compatibility compared to standard GPU Spaces.
- Unexpected issues may arise in some scenarios.
Supported Versions
Gradio: 4+
PyTorch: Almost all versions from 2.1.0 to latest are supported
See full list
- 2.1.0
- 2.1.1
- 2.1.2
- 2.2.0
- 2.2.2
- 2.4.0
- 2.5.1
- 2.6.0
- 2.7.1
- 2.8.0
Python: 3.10.13
Getting started with ZeroGPU
To utilize ZeroGPU in your Space, follow these steps:
- Make sure the ZeroGPU hardware is selected in your Space settings.
- Import the
spaces
module. - Decorate GPU-dependent functions with
@spaces.GPU
.
This decoration process allows the Space to request a GPU when the function is called and release it upon completion.
Example Usage
import spaces
from diffusers import DiffusionPipeline
pipe = DiffusionPipeline.from_pretrained(...)
pipe.to('cuda')
@spaces.GPU
def generate(prompt):
return pipe(prompt).images
gr.Interface(
fn=generate,
inputs=gr.Text(),
outputs=gr.Gallery(),
).launch()
Note: The @spaces.GPU
decorator is designed to be effect-free in non-ZeroGPU environments, ensuring compatibility across different setups.
Duration Management
For functions expected to exceed the default 60-second of GPU runtime, you can specify a custom duration:
@spaces.GPU(duration=120)
def generate(prompt):
return pipe(prompt).images
This sets the maximum function runtime to 120 seconds. Specifying shorter durations for quicker functions will improve queue priority for Space visitors.
Dynamic duration
@spaces.GPU
also supports dynamic durations.
Instead of directly passing a duration, simply pass a callable that takes the same inputs as your decorated function and returns a duration value:
def get_duration(prompt, steps):
step_duration = 3.75
return steps * step_duration
@spaces.GPU(duration=get_duration)
def generate(prompt, steps):
return pipe(prompt, num_inference_steps=steps).images
Compilation
ZeroGPU does not support torch.compile
, but you can use PyTorch ahead-of-time compilation (requires torch 2.8+
)
Check out this blogpost for a complete guide on ahead-of-time compilation on ZeroGPU.
Hosting Limitations
- Personal accounts (PRO subscribers): Maximum of 10 ZeroGPU Spaces.
- Organization accounts (Enterprise Hub): Maximum of 50 ZeroGPU Spaces.
By leveraging ZeroGPU, developers can create more efficient and scalable Spaces, maximizing GPU utilization while minimizing costs.
Recommendations
If your demo uses a large model, we recommend using optimizations like ahead-of-time compilation and flash-attention 3. You can learn how to leverage these with ZeroGPU in this post. These optimizations will help you to maximize the advantages of ZeroGPU hours and provide a better user experience.
Feedback
You can share your feedback on Spaces ZeroGPU directly on the HF Hub: https://huggingface.co/spaces/zero-gpu-explorers/README/discussions
< > Update on GitHub