auffusion committed on
Commit
6b038b2
1 Parent(s): da9b5ff

first commit

Files changed (1)
  1. README.md +2 -0
README.md CHANGED
@@ -10,7 +10,9 @@ tags:
10
  **Auffusion** is a latent diffusion model (LDM) for text-to-audio (TTA) generation. **Auffusion** can generate realistic audios including human sounds, animal sounds, natural and artificial sounds and sound effects from textual prompts. We introduce Auffusion, a TTA system adapting T2I model frameworks to TTA task, by effectively leveraging their inherent generative strengths and precise cross-modal alignment. Our objective and subjective evaluations demonstrate that Auffusion surpasses previous TTA approaches using limited data and computational resource. We release our model, inference code, and pre-trained checkpoints for the research community.
11
 
12
  📣 We are releasing **Auffusion-Full-no-adapter** which was pre-trained on all datasets described in paper and created for easy use of audio manipulation.
 
13
  📣 We are releasing **Auffusion-Full** which was pre-trained on all datasets described in paper.
 
14
  📣 We are releasing **Auffusion** which was pre-trained on **AudioCaps**.
15
 
16
  ## Auffusion Model Family
 
10
  **Auffusion** is a latent diffusion model (LDM) for text-to-audio (TTA) generation. **Auffusion** can generate realistic audios including human sounds, animal sounds, natural and artificial sounds and sound effects from textual prompts. We introduce Auffusion, a TTA system adapting T2I model frameworks to TTA task, by effectively leveraging their inherent generative strengths and precise cross-modal alignment. Our objective and subjective evaluations demonstrate that Auffusion surpasses previous TTA approaches using limited data and computational resource. We release our model, inference code, and pre-trained checkpoints for the research community.
11
 
12
  📣 We are releasing **Auffusion-Full-no-adapter** which was pre-trained on all datasets described in paper and created for easy use of audio manipulation.
13
+
14
  📣 We are releasing **Auffusion-Full** which was pre-trained on all datasets described in paper.
15
+
16
  📣 We are releasing **Auffusion** which was pre-trained on **AudioCaps**.
17
 
18
  ## Auffusion Model Family