GaussianProperty: Integrating Physical Properties to 3D Gaussians with LMMs Paper • 2412.11258 • Published Dec 15, 2024 • 13
SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator Paper • 2412.12094 • Published Dec 16, 2024 • 10
ColorFlow: Retrieval-Augmented Image Sequence Colorization Paper • 2412.11815 • Published Dec 16, 2024 • 26
Evaluation Agent: Efficient and Promptable Evaluation Framework for Visual Generative Models Paper • 2412.09645 • Published Dec 10, 2024 • 35
Byte Latent Transformer: Patches Scale Better Than Tokens Paper • 2412.09871 • Published Dec 13, 2024 • 88
RetroLLM: Empowering Large Language Models to Retrieve Fine-grained Evidence within Generation Paper • 2412.11919 • Published Dec 16, 2024 • 33
SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models Paper • 2412.11605 • Published Dec 16, 2024 • 17
IDArb: Intrinsic Decomposition for Arbitrary Number of Input Views and Illuminations Paper • 2412.12083 • Published Dec 16, 2024 • 12
FluxSpace: Disentangled Semantic Editing in Rectified Flow Transformers Paper • 2412.09611 • Published Dec 12, 2024 • 9
FireFlow: Fast Inversion of Rectified Flow for Image Semantic Editing Paper • 2412.07517 • Published Dec 10, 2024 • 11
ObjectMate: A Recurrence Prior for Object Insertion and Subject-Driven Generation Paper • 2412.08645 • Published Dec 11, 2024 • 11
InstanceCap: Improving Text-to-Video Generation via Instance-aware Structured Caption Paper • 2412.09283 • Published Dec 12, 2024 • 19
Large Action Models: From Inception to Implementation Paper • 2412.10047 • Published Dec 13, 2024 • 32
BiMediX2: Bio-Medical EXpert LMM for Diverse Medical Modalities Paper • 2412.07769 • Published Dec 10, 2024 • 26
SynerGen-VL: Towards Synergistic Image Understanding and Generation with Vision Experts and Token Folding Paper • 2412.09604 • Published Dec 12, 2024 • 35
Apollo: An Exploration of Video Understanding in Large Multimodal Models Paper • 2412.10360 • Published Dec 13, 2024 • 139
FreeScale: Unleashing the Resolution of Diffusion Models via Tuning-Free Scale Fusion Paper • 2412.09626 • Published Dec 12, 2024 • 20