Trials, Errors, and Breakthroughs: Our Rocky Road to OVD SOTA with Reinforcement Learning
Multimodal AI, Agents
Om AI Lab is a passionate group building multimodal AI agents that reshape our work and life.
Open Agent Leaderboard
Process and answer questions about webpage videos
VLM-R1 model for Open-Vocabulary Object Detection
Highlight described objects in images