VideoChat-R1: Enhancing Spatio-Temporal Perception via Reinforcement Fine-Tuning Paper • 2504.06958 • Published 14 days ago • 10
Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment Paper • 2412.19326 • Published Dec 26, 2024 • 18