Fast image relighting using Latent Bridge Matching
Conversational speech generation
Convert voices using reference audio
A text-to-speech model powered by SparkAudio and Mobvoi
Blazingly Fast and Embarrassingly Simple Song Generation