MagicComp: Training-free Dual-Phase Refinement for Compositional Video Generation Paper • 2503.14428 • Published 14 days ago • 8
MagicComp: Training-free Dual-Phase Refinement for Compositional Video Generation Paper • 2503.14428 • Published 14 days ago • 8
LLaVA-o1: Let Vision Language Models Reason Step-by-Step Paper • 2411.10440 • Published Nov 15, 2024 • 123
Infinity-MM: Scaling Multimodal Performance with Large-Scale and High-Quality Instruction Data Paper • 2410.18558 • Published Oct 24, 2024 • 20
Instance Brownian Bridge as Texts for Open-vocabulary Video Instance Segmentation Paper • 2401.09732 • Published Jan 18, 2024
LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization Paper • 2502.13922 • Published Feb 19 • 25
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published Jan 22 • 367
OmniThink: Expanding Knowledge Boundaries in Machine Writing through Thinking Paper • 2501.09751 • Published Jan 16 • 48
MMDocIR: Benchmarking Multi-Modal Retrieval for Long Documents Paper • 2501.08828 • Published Jan 15 • 31