Abstract
Kairos is a native world model framework that learns from diverse experiences, maintains persistent states through hybrid temporal attention, and supports efficient deployment for physical AI applications.
World models are transitioning from passive visual generators to foundational, operational infrastructure for Physical AI: they must natively acquire world knowledge from heterogeneous experience, maintain persistent states over long horizons, and execute efficiently within real deployment constraints. We introduce Kairos, a native world model stack designed around these requirements. (1) Kairos learns the world by pioneering a Native Pre-training Paradigm governed by a Cross-Embodiment Data Curriculum, which organizes open-world videos, human behavioral data, and robot interactions into a progressive developmental pathway. (2) Kairos maintains the world by unified world understanding, generation, and prediction within a Native Unified Architecture equipped with Hybrid Linear Temporal Attention, where sliding-window attention captures local dynamics, dilated sliding windows capture mid-range dependencies, and gated linear attention maintains persistent global memory. We establish formal theoretical bounds demonstrating that this temporal factorization strictly limits error accumulation, mathematically guaranteeing state propagation across extended horizons. (3) Kairos runs the world by incorporating a Deployment-Aware System Co-Design to support low-latency rollout generation on server and consumer-grade hardware for real-world observation-action-feedback loops. Experiments on embodied world-model, long-horizon, and action-policy benchmarks show that Kairos achieves top level performance while offering a strong efficiency-capability trade-off. Together, these results position Kairos as a cohesive operational foundation for future self-evolving physical intelligence.
Community
Kairos Technical Report
Interesting breakdown of this paper on arXivLens: https://arxivlens.com/PaperView/Details/kairos-a-native-world-model-stack-for-physical-ai-3565-d2787c60
Covers the executive summary, detailed methodology, and practical applications.
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- World Action Models: The Next Frontier in Embodied AI (2026)
- MotuBrain: An Advanced World Action Model for Robot Control (2026)
- Qwen-RobotWorld Technical Report: Unifying Embodied World Modeling through Language-Conditioned Video Generation (2026)
- HiMem-WAM: Hierarchical Memory-Gated World Action Models for Robotic Manipulation (2026)
- AHA-WAM:Asynchronous Horizon-Adaptive World-Action Modeling with Observation-Guided Context Routing (2026)
- Cosmos 3: Omnimodal World Models for Physical AI (2026)
- MolmoAct2: Action Reasoning Models for Real-world Deployment (2026)
Please give a thumbs up to this comment if you found it helpful!
If you want recommendations for any Paper on Hugging Face checkout this Space
You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: @librarian-bot recommend
Get this paper in your agent:
hf papers read 2606.16533 Don't have the latest CLI?
curl -LsSf https://hf.co/cli/install.sh | bash Models citing this paper 0
No model linking this paper
Datasets citing this paper 0
No dataset linking this paper
Spaces citing this paper 0
No Space linking this paper