MergeVQ: A Unified Framework for Visual Generation and Representation with Disentangled Token Merging and Quantization • Paper • 2504.00999 • Published 21 days ago • 83
ILLUME+: Illuminating Unified MLLM with Dual Visual Tokenization and Diffusion Refinement • Paper • 2504.01934 • Published 20 days ago • 23
PaperBench: Evaluating AI's Ability to Replicate AI Research • Paper • 2504.01848 • Published 20 days ago • 36
Safeguarding Vision-Language Models: Mitigating Vulnerabilities to Gaussian Noise in Perturbation-based Attacks • Paper • 2504.01308 • Published 20 days ago • 13
DreamActor-M1: Holistic, Expressive and Robust Human Image Animation with Hybrid Guidance • Paper • 2504.01724 • Published 20 days ago • 64
Articulated Kinematics Distillation from Video Diffusion Models • Paper • 2504.01204 • Published 21 days ago • 24