Oryx MLLM: On-Demand Spatial-Temporal Understanding at Arbitrary Resolution Paper • 2409.12961 • Published 15 days ago • 23
Transformer Explainer: Interactive Learning of Text-Generative Models Paper • 2408.04619 • Published Aug 8 • 153
LLaVa-NeXT-Video Collection LLaVa-NeXT-Video extends LLaVa-NeXT for video understanding. • 5 items • Updated Jun 10 • 4
LLaVA-Video Collection Models focus on video understanding (previously known as LLaVA-NeXT-Video). • 6 items • Updated about 2 hours ago • 40