StdGEN: Semantic-Decomposed 3D Character Generation from Single Images • Paper • 2411.05738 • Published 12 days ago • 13
WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning • Paper • 2411.02337 • Published 16 days ago • 36
OS-ATLAS: A Foundation Action Model for Generalist GUI Agents • Paper • 2410.23218 • Published 21 days ago • 46
ReferEverything: Towards Segmenting Everything We Can Speak of in Videos • Paper • 2410.23287 • Published 21 days ago • 17
D-FINE: Redefine Regression Task in DETRs as Fine-grained Distribution Refinement • Paper • 2410.13842 • Published Oct 17 • 1
Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models • Paper • 2409.17146 • Published Sep 25 • 102
ROCKET-1: Master Open-World Interaction with Visual-Temporal Context Prompting • Paper • 2410.17856 • Published 28 days ago • 49
Unbounded: A Generative Infinite Game of Character Life Simulation • Paper • 2410.18975 • Published 27 days ago • 34
Animate-X: Universal Character Image Animation with Enhanced Motion Representation • Paper • 2410.10306 • Published Oct 14 • 52
Omni-MATH: A Universal Olympiad Level Mathematic Benchmark For Large Language Models • Paper • 2410.07985 • Published Oct 10 • 26
Loopy: Taming Audio-Driven Portrait Avatar with Long-Term Motion Dependency • Paper • 2409.02634 • Published Sep 4 • 89
Transformer Explainer: Interactive Learning of Text-Generative Models • Paper • 2408.04619 • Published Aug 8 • 155
Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining • Paper • 2408.02657 • Published Aug 5 • 32
Reenact Anything: Semantic Video Motion Transfer Using Motion-Textual Inversion • Paper • 2408.00458 • Published Aug 1 • 10