Eagle 2.5: Boosting Long-Context Post-Training for Frontier Vision-Language Models Paper • 2504.15271 • Published 1 day ago • 53
Frozen Transformers in Language Models Are Effective Visual Encoder Layers Paper • 2310.12973 • Published Oct 19, 2023 • 1
Situational Awareness Matters in 3D Vision Language Reasoning Paper • 2406.07544 • Published Jun 11, 2024 • 1
ReferEverything: Towards Segmenting Everything We Can Speak of in Videos Paper • 2410.23287 • Published Oct 30, 2024 • 19
Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Understanding Paper • 2409.03757 • Published Sep 5, 2024 • 2
Floating No More: Object-Ground Reconstruction from a Single Image Paper • 2407.18914 • Published Jul 26, 2024 • 19