V-MAGE: A Game Evaluation Framework for Assessing Visual-Centric Capabilities in Multimodal Large Language Models Paper • 2504.06148 • Published 8 days ago • 12 • 2
Beyond Words: Advancing Long-Text Image Generation via Multimodal Autoregressive Models Paper • 2503.20198 • Published 22 days ago • 4 • 3
BizGen: Advancing Article-level Visual Text Rendering for Infographics Generation Paper • 2503.20672 • Published 21 days ago • 13 • 3
Automated Movie Generation via Multi-Agent CoT Planning Paper • 2503.07314 • Published Mar 10 • 43
DoraCycle: Domain-Oriented Adaptation of Unified Generative Model in Multimodal Cycles Paper • 2503.03651 • Published Mar 5 • 16
Difix3D+: Improving 3D Reconstructions with Single-Step Diffusion Models Paper • 2503.01774 • Published Mar 3 • 43
PhotoDoodle: Learning Artistic Image Editing from Few-Shot Pairwise Data Paper • 2502.14397 • Published Feb 20 • 41
WorldGUI: Dynamic Testing for Comprehensive Desktop GUI Automation Paper • 2502.08047 • Published Feb 12 • 27
TextAtlas5M: A Large-scale Dataset for Dense Text Image Generation Paper • 2502.07870 • Published Feb 11 • 44 • 2