Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey Paper • 2412.18619 • Published Dec 16, 2024 • 55
LLaMA-Omni: Seamless Speech Interaction with Large Language Models Paper • 2409.06666 • Published Sep 10, 2024 • 57