RoboFactory: Exploring Embodied Agent Collaboration with Compositional Constraints Paper • 2503.16408 • Published Mar 20 • 40
ConceptMaster: Multi-Concept Video Customization on Diffusion Transformer Models Without Test-Time Tuning Paper • 2501.04698 • Published Jan 8 • 15
WorldSimBench: Towards Video Generation Models as World Simulators Paper • 2410.18072 • Published Oct 23, 2024 • 20
F-HOI: Toward Fine-grained Semantic-Aligned 3D Human-Object Interactions Paper • 2407.12435 • Published Jul 17, 2024 • 14
MotionLLM: Understanding Human Behaviors from Human Motions and Videos Paper • 2405.20340 • Published May 30, 2024 • 21
SEED-Bench-2-Plus: Benchmarking Multimodal Large Language Models with Text-Rich Visual Comprehension Paper • 2404.16790 • Published Apr 25, 2024 • 9