32 NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models · 19 authors 2
9 Modeling Collaborator: Enabling Subjective Vision Classification With Minimal Human Effort via LLM Tool-Use · 13 authors 1
8 Feast Your Eyes: Mixture-of-Resolution Adaptation for Multimodal Large Language Models · 6 authors 1