Gaze-LLE: Gaze Target Estimation via Large-Scale Learned Encoders Paper • 2412.09586 • Published 1 day ago
DisPose: Disentangling Pose Guidance for Controllable Human Image Animation Paper • 2412.09349 • Published 1 day ago
SAME: Learning Generic Language-Guided Visual Navigation with State-Adaptive Mixture of Experts Paper • 2412.05552 • Published 7 days ago
Lyra: An Efficient and Speech-Centric Framework for Omni-Cognition Paper • 2412.09501 • Published 1 day ago
JuStRank: Benchmarking LLM Judges for System Ranking Paper • 2412.09569 • Published 1 day ago
LoRACLR: Contrastive Adaptation for Customization of Diffusion Models Paper • 2412.09622 • Published 1 day ago
InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions Paper • 2412.09596 • Published 1 day ago
Neural LightRig: Unlocking Accurate Object Normal and Material Estimation with Multi-Light Diffusion Paper • 2412.09593 • Published 1 day ago
Word Sense Linking: Disambiguating Outside the Sandbox Paper • 2412.09370 • Published 1 day ago
AgentTrek: Agent Trajectory Synthesis via Guiding Replay with Web Tutorials Paper • 2412.09605 • Published 1 day ago
EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via Multimodal LLM Paper • 2412.09618 • Published 1 day ago
RuleArena: A Benchmark for Rule-Guided Reasoning with LLMs in Real-World Scenarios Paper • 2412.08972 • Published 2 days ago
Shiksha: A Technical Domain focused Translation Dataset and Model for Indian Languages Paper • 2412.09025 • Published 2 days ago
Arbitrary-steps Image Super-resolution via Diffusion Inversion Paper • 2412.09013 • Published 2 days ago
PIG: Physics-Informed Gaussians as Adaptive Parametric Mesh Representations Paper • 2412.05994 • Published 5 days ago
Euclid: Supercharging Multimodal LLMs with Synthetic High-Fidelity Visual Descriptions Paper • 2412.08737 • Published 2 days ago
The Impact of Copyrighted Material on Large Language Models: A Norwegian Perspective Paper • 2412.09460 • Published 1 day ago
FreeSplatter: Pose-free Gaussian Splatting for Sparse-view 3D Reconstruction Paper • 2412.09573 • Published 1 day ago