A ZeroGPU Hugging Face Space for testing?
Hi Team, please create a ZeroGPU Hugging Face Space for testing.
Thanks for the suggestion! Unfortunately, ZeroGPU won't work well for Scenema Audio. The pipeline loads multiple models in sequence (Gemma 3 12B text encoder, audio diffusion transformer,
MelBandRoFormer for vocal separation, SeedVC for voice identity transfer), and the model loading alone would exceed Hugging Face's 60-second execution limit for ZeroGPU Spaces.
The easiest way to try it right now is to sign up for free at scenema.ai. Start a conversation, and the director agent will generate voiceover prompts for you. You can select Scenema Audio from
the voice model dropdown and do voice design directly in the platform.
For self-hosting, the GitHub repo has a Docker setup that handles all model management automatically.
If you need help or have questions, join us on Discord: https://discord.com/invite/EEErCFsnzw