Om AI Lab

Team

company

https://github.com/om-ai-lab

OmAI_lab

om-ai-lab

AI & ML interests

Multimodal AI, VLM, VLA, VAM, etc.

Recent Activity

P3ngLiu updated a model about 16 hours ago

omlab/VLX-Seek-1.5-10B

P3ngLiu updated a collection about 17 hours ago

VLX-Seek 1.5-Models

P3ngLiu published a model about 18 hours ago

omlab/VLX-Seek-1.5-10B

Papers

Which Pretraining Paradigm Better Serves Spatial Intelligence? An Empirical Comparison of Vision-Language and Video Generation Models

VLM-FO1: Bridging the Gap Between High-Level Reasoning and Fine-Grained Perception in VLMs

Articles

VLX-Go: Vision-Language Short-Horizon Waypoint Prediction for Embodied Navigation

25 days ago

• 12

VLX-Seek: Improving VLM Fine-Grained Perception via Region Reference Instead of Coordinate Generation

26 days ago

• 14

VLX-Flow: Continuous Video Understanding for Real-Time Multimodal Interaction

27 days ago

• 15

Trials, Errors, and Breakthroughs: Our Rocky Road to OVD SOTA with Reinforcement Learning

Mar 25, 2025

• 3

Improving Object Detection through Reinforcement Learning with VLM-R1

Mar 25, 2025

• 4

Organization Card

Om AI Lab is a passionate group building multimodal foundation models for physical AI that reshape our work and life.

Collections 6

Spaces 5

Open Agent Leaderboard

🥇

Open Agent Leaderboard

VLM R1 Referral Expression

💬

Mark regions in images based on text descriptions

OmAgent

💬

Process and answer questions about webpage videos

VLM R1 OVD

👁

VLM-R1 model for Open-Vocabulary Object Detection

Models 10

Datasets 12

omlab/SARDet_REC6_NORM-FS

Viewer • Updated Feb 4 • 968 • 36

omlab/SARDet_REC6-FS

Viewer • Updated Feb 4 • 968 • 39

omlab/SARDet3-FS

Viewer • Updated Feb 1 • 270 • 21

omlab/Cross_DIOR-RSVG

Viewer • Updated Oct 2, 2025 • 7.42k • 19

omlab/Cross_RRSIS-D

Viewer • Updated Oct 2, 2025 • 3.48k • 126

omlab/VRSBench-FS

Viewer • Updated Oct 2, 2025 • 16.6k • 79 • 1

omlab/NWPU-FS

Viewer • Updated Oct 2, 2025 • 39 • 20

omlab/EarthReason-FS

Viewer • Updated Oct 2, 2025 • 3.39k • 104 • 1

omlab/VLM-R1

Preview • Updated Apr 23, 2025 • 475 • 18

omlab/RS5M

Viewer • Updated Mar 16, 2025 • 7.25M • 192 • 1

Team members 3
