SmolVLM: Redefining small and efficient multimodal models Paper • 2504.05299 • Published 15 days ago • 168
ShowUI: One Vision-Language-Action Model for GUI Visual Agent Paper • 2411.17465 • Published Nov 26, 2024 • 87
Improving Vision-Language-Action Model with Online Reinforcement Learning Paper • 2501.16664 • Published Jan 28 • 1
JARVIS-VLA: Post-Training Large-Scale Vision Language Models to Play Visual Games with Keyboards and Mouse Paper • 2503.16365 • Published Mar 20 • 40
Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey Paper • 2503.12605 • Published Mar 16 • 34
DiffSplat: Repurposing Image Diffusion Models for Scalable Gaussian Splat Generation Paper • 2501.16764 • Published Jan 28 • 22
VideoRAG: Retrieval-Augmented Generation over Video Corpus Paper • 2501.05874 • Published Jan 10 • 72
LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs Paper • 2501.06186 • Published Jan 10 • 66
SPAR3D: Stable Point-Aware Reconstruction of 3D Objects from Single Images Paper • 2501.04689 • Published Jan 8 • 17 • 5
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Paper • 2501.04519 • Published Jan 8 • 277 • 42
DPO Kernels: A Semantically-Aware, Kernel-Enhanced, and Divergence-Rich Paradigm for Direct Preference Optimization Paper • 2501.03271 • Published Jan 5 • 11
VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM Paper • 2501.00599 • Published Dec 31, 2024 • 48
Molar: Multimodal LLMs with Collaborative Filtering Alignment for Enhanced Sequential Recommendation Paper • 2412.18176 • Published Dec 24, 2024 • 16
Health AI Developer Foundations (HAI-DEF) Collection Groups models released for use in health AI by Google. Read more about HAI-DEF at https://developers.google.com/health-ai-developer-foundations • 10 items • Updated 8 days ago • 38