OminiControl: Minimal and Universal Control for Diffusion Transformer Paper • 2411.15098 • Published 3 days ago • 21
BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games Paper • 2411.13543 • Published 5 days ago • 13
VideoRepair: Improving Text-to-Video Generation via Misalignment Evaluation and Localized Refinement Paper • 2411.15115 • Published 3 days ago • 5
Style-Friendly SNR Sampler for Style-Driven Generation Paper • 2411.14793 • Published 4 days ago • 28
TÜLU 3: Pushing Frontiers in Open Language Model Post-Training Paper • 2411.15124 • Published 3 days ago • 38
Large Multi-modal Models Can Interpret Features in Large Multi-modal Models Paper • 2411.14982 • Published 3 days ago • 11
VideoEspresso: A Large-Scale Chain-of-Thought Dataset for Fine-Grained Video Reasoning via Core Frame Selection Paper • 2411.14794 • Published 4 days ago • 9
Efficient Long Video Tokenization via Coordinate-based Patch Reconstruction Paper • 2411.14762 • Published 4 days ago • 9
One to rule them all: natural language to bind communication, perception and action Paper • 2411.15033 • Published 3 days ago • 2
Hymba: A Hybrid-head Architecture for Small Language Models Paper • 2411.13676 • Published 5 days ago • 34
DINO-X: A Unified Vision Model for Open-World Object Detection and Understanding Paper • 2411.14347 • Published 4 days ago • 8
Multimodal Autoregressive Pre-training of Large Vision Encoders Paper • 2411.14402 • Published 4 days ago • 36
Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions Paper • 2411.14405 • Published 4 days ago • 46
Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization Paper • 2411.10442 • Published 10 days ago • 57
OpenScholar: Synthesizing Scientific Literature with Retrieval-augmented LMs Paper • 2411.14199 • Published 4 days ago • 23
Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models Paper • 2411.14432 • Published 4 days ago • 18
Stable Flow: Vital Layers for Training-Free Image Editing Paper • 2411.14430 • Published 4 days ago • 11