SlotVTG: Object-Centric Adapter for Generalizable Video Temporal Grounding Paper • 2603.25733 • Published Mar 26 • 1
Why Can't I Open My Drawer? Mitigating Object-Driven Shortcuts in Zero-Shot Compositional Action Recognition Paper • 2601.16211 • Published 2 days ago • 3
Multi-Granular Spatio-Temporal Token Merging for Training-Free Acceleration of Video LLMs Paper • 2507.07990 • Published Jul 10, 2025 • 45