InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions Paper • 2412.09596 • Published 22 days ago • 92
EXAONE 3.5: Series of Large Language Models for Real-world Use Cases Paper • 2412.04862 • Published 29 days ago • 48
VisionZip: Longer is Better but Not Necessary in Vision Language Models Paper • 2412.04467 • Published 29 days ago • 105