- Making Flow-Matching-Based Zero-Shot Text-to-Speech Laugh as You Like
  Paper • 2402.07383 • Published • 13
- Matcha-TTS: A fast TTS architecture with conditional flow matching
  Paper • 2309.03199 • Published • 11
- Natural language guidance of high-fidelity text-to-speech with synthetic annotations
  Paper • 2402.01912 • Published • 11
- Fast Timing-Conditioned Latent Audio Diffusion
  Paper • 2402.04825 • Published • 7
RO-HOON OH
heiscold
AI & ML interests
TTS, Audio Editing, Speech Editing
Organizations
None yet
Collections
5
- MusicMagus: Zero-Shot Text-to-Music Editing via Diffusion Models
  Paper • 2402.06178 • Published • 13
- DITTO: Diffusion Inference-Time T-Optimization for Music Generation
  Paper • 2401.12179 • Published • 20
- Fast Timing-Conditioned Latent Audio Diffusion
  Paper • 2402.04825 • Published • 7
- Brain2Music: Reconstructing Music from Human Brain Activity
  Paper • 2307.11078 • Published • 40
Models
None public yet
Datasets
None public yet