Flowing from Words to Pixels: A Framework for Cross-Modality Evolution Paper • 2412.15213 • Published 14 days ago • 25
Structured 3D Latents for Scalable and Versatile 3D Generation Paper • 2412.01506 • Published Dec 2, 2024 • 50
SynCamMaster: Synchronizing Multi-Camera Video Generation from Diverse Viewpoints Paper • 2412.07760 • Published 23 days ago • 50
SplatFlow: Multi-View Rectified Flow Model for 3D Gaussian Splatting Synthesis Paper • 2411.16443 • Published Nov 25, 2024 • 9
Material Anything: Generating Materials for Any 3D Object via Diffusion Paper • 2411.15138 • Published Nov 22, 2024 • 42
Stronger Models are NOT Stronger Teachers for Instruction Tuning Paper • 2411.07133 • Published Nov 11, 2024 • 34
OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models Paper • 2411.04905 • Published Nov 7, 2024 • 111
"Give Me BF16 or Give Me Death"? Accuracy-Performance Trade-Offs in LLM Quantization Paper • 2411.02355 • Published Nov 4, 2024 • 46
Unpacking SDXL Turbo: Interpreting Text-to-Image Models with Sparse Autoencoders Paper • 2410.22366 • Published Oct 28, 2024 • 77
MotionCLR: Motion Generation and Training-free Editing via Understanding Attention Mechanisms Paper • 2410.18977 • Published Oct 24, 2024 • 14
Unbounded: A Generative Infinite Game of Character Life Simulation Paper • 2410.18975 • Published Oct 24, 2024 • 35
Web Agents with World Models: Learning and Leveraging Environment Dynamics in Web Navigation Paper • 2410.13232 • Published Oct 17, 2024 • 41
Preserving Multi-Modal Capabilities of Pre-trained VLMs for Improving Vision-Linguistic Compositionality Paper • 2410.05210 • Published Oct 7, 2024 • 10