GigaBrain-0.5M*: a VLA That Learns From World Model-Based Reinforcement Learning • Paper • 2602.12099 • Published 3 days ago • 49
PhyCritic: Multimodal Critic Models for Physical AI • Paper • 2602.11124 • Published 4 days ago • 50
RADIO • Collection • A collection of Foundation Vision Models that combine multiple models (CLIP, DINOv2, SAM, etc.). • 19 items • Updated 4 days ago • 32
google/siglip2-giant-opt-patch16-384 • Zero-Shot Image Classification • 2B • Updated Feb 21, 2025 • 119k • 35
MemOCR: Layout-Aware Visual Memory for Efficient Long-Horizon Reasoning • Paper • 2601.21468 • Published 18 days ago • 21
llm-semantic-router/multi-modal-embed-small • Sentence Similarity • Updated 10 days ago • 113 • 14
DocReward: A Document Reward Model for Structuring and Stylizing • Paper • 2510.11391 • Published Oct 13, 2025 • 27
OpenMed/Medical-Reasoning-SFT-Nemotron-Nano-30B • Viewer • Updated 12 days ago • 445k • 939 • 43
TwinBrainVLA: Unleashing the Potential of Generalist VLMs for Embodied Tasks via Asymmetric Mixture-of-Transformers • Paper • 2601.14133 • Published 26 days ago • 60