This model is a VAE version of the Descript Audio Codec (DAC) with a continuous latent space. DAC is a high-fidelity, general-purpose neural audio codec introduced in the paper "High-Fidelity Audio Compression with Improved RVQGAN". Most of the code is adapted from the open-source DAC repository.

According to the Semantic-VAE paper, this semantic distillation approach improves the training efficiency and performance of downstream TTS models. Furthermore, by reducing the latent dimension to 32, this new variant enables even lighter and faster training for these downstream tasks without sacrificing much audio quality.
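
To make "continuous latent space" and "latent dimension 32" concrete, here is a minimal sketch of a VAE bottleneck for a single latent frame. It assumes a diagonal Gaussian posterior parameterized by a mean and a log-variance, which is the textbook VAE formulation and not necessarily how this model's code is organized. The names `vae_bottleneck` and `kl_to_standard_normal` are hypothetical and do not come from the DAC repository.

```python
import math
import random

LATENT_DIM = 32  # latent dimension stated in this model card


def vae_bottleneck(mean, logvar, rng):
    """Draw z = mean + std * eps with eps ~ N(0, 1) (reparameterization trick).

    Hypothetical helper: assumes the encoder emits a per-dimension mean and
    log-variance for each latent frame.
    """
    if len(mean) != len(logvar):
        raise ValueError("mean and logvar must have the same length")
    return [
        m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
        for m, lv in zip(mean, logvar)
    ]


def kl_to_standard_normal(mean, logvar):
    """KL divergence from N(mean, diag(exp(logvar))) to N(0, I), summed over dims."""
    return 0.5 * sum(
        m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mean, logvar)
    )


# Stand-in encoder output for one frame (assumed values, not real model output).
rng = random.Random(0)
mean = [0.0] * LATENT_DIM
logvar = [0.0] * LATENT_DIM

z = vae_bottleneck(mean, logvar, rng)  # continuous 32-dim latent vector
kl = kl_to_standard_normal(mean, logvar)
```

Unlike the residual vector quantization in the original DAC, which maps each frame to discrete codebook indices, the bottleneck above yields a real-valued vector per frame. A downstream TTS model can then predict or regress these vectors directly.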

Thanks to facebook/dacvae-watermarked and Aratako/Semantic-DACVAE-Japanese-32dim.

