Apply for community grant: Academic project
TANGO is a latent diffusion model (LDM) for text-to-audio (TTA) generation. TANGO can generate realistic audio, including human sounds, animal sounds, natural and artificial sounds, and sound effects, from textual prompts. We use the frozen instruction-tuned LLM Flan-T5 as the text encoder and train a UNet-based diffusion model for audio generation. We perform comparably to current state-of-the-art models for TTA across both objective and subjective metrics, despite training the LDM on a 63 times smaller dataset. We release our model, training and inference code, and pre-trained checkpoints for the research community.
We hereby request HF to give us GPUs to run this space.
Meanwhile, we are also reaching out to companies for GPU resources to train Tango on larger datasets such as AudioSet, so that Tango can generate a wider range of audio.
-Team Tango
Dear HF,
Let us know the status of our application, please. Thanks.
-Team Tango
Hi @soujanyaporia, we have assigned a GPU to this space. Note that GPU grants are provided temporarily and might be removed after some time if usage is very low.
To learn more about GPUs in Spaces, please check out https://huggingface.co/docs/hub/spaces-gpus
Hi @akhaliq, a zillion thanks :)