Shao

Castielll

Castielll's activity

upvoted 3 papers 8 months ago

Mini-Omni: Language Models Can Hear, Talk While Thinking in Streaming

Paper • 2408.16725 • Published Aug 29, 2024 • 54

Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model

Paper • 2408.11039 • Published Aug 20, 2024 • 61

VITA: Towards Open-Source Interactive Omni Multimodal LLM

Paper • 2408.05211 • Published Aug 9, 2024 • 49

upvoted 3 papers 9 months ago

The Llama 3 Herd of Models

Paper • 2407.21783 • Published Jul 31, 2024 • 115

PSLM: Parallel Generation of Text and Speech with LLMs for Low-Latency Spoken Dialogue Systems

Paper • 2406.12428 • Published Jun 18, 2024 • 1

Stable Audio Open

Paper • 2407.14358 • Published Jul 19, 2024 • 27

upvoted 4 papers about 1 year ago

Long-form music generation with latent diffusion

Paper • 2404.10301 • Published Apr 16, 2024 • 28

Audio Dialogues: Dialogues dataset for audio and music understanding

Paper • 2404.07616 • Published Apr 11, 2024 • 16

DeepSeek-VL: Towards Real-World Vision-Language Understanding

Paper • 2403.05525 • Published Mar 8, 2024 • 45

MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation

Paper • 2404.05674 • Published Apr 8, 2024 • 15