3DGStream: On-the-Fly Training of 3D Gaussians for Efficient Streaming of Photo-Realistic Free-Viewpoint Videos Paper • 2403.01444 • Published Mar 3, 2024 • 6
ViewDiff: 3D-Consistent Image Generation with Text-to-Image Models Paper • 2403.01807 • Published Mar 4, 2024 • 9
Tuning-Free Noise Rectification for High Fidelity Image-to-Video Generation Paper • 2403.02827 • Published Mar 5, 2024 • 8
TripoSR: Fast 3D Object Reconstruction from a Single Image Paper • 2403.02151 • Published Mar 4, 2024 • 14
ResAdapter: Domain Consistent Resolution Adapter for Diffusion Models Paper • 2403.02084 • Published Mar 4, 2024 • 15
InfiMM-HD: A Leap Forward in High-Resolution Multimodal Understanding Paper • 2403.01487 • Published Mar 3, 2024 • 16
MovieLLM: Enhancing Long Video Understanding with AI-Generated Movies Paper • 2403.01422 • Published Mar 3, 2024 • 28
DenseMamba: State Space Models with Dense Hidden Connection for Efficient Large Language Models Paper • 2403.00818 • Published Feb 26, 2024 • 19
OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on Paper • 2403.01779 • Published Mar 4, 2024 • 30
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis Paper • 2403.03206 • Published Mar 5, 2024 • 63
AtP*: An efficient and scalable method for localizing LLM behaviour to components Paper • 2403.00745 • Published Mar 1, 2024 • 14
RealCustom: Narrowing Real Text Word for Real-Time Open-Domain Text-to-Image Customization Paper • 2403.00483 • Published Mar 1, 2024 • 15
Resonance RoPE: Improving Context Length Generalization of Large Language Models Paper • 2403.00071 • Published Feb 29, 2024 • 24
Learning and Leveraging World Models in Visual Representation Learning Paper • 2403.00504 • Published Mar 1, 2024 • 33
VisionLLaMA: A Unified LLaMA Interface for Vision Tasks Paper • 2403.00522 • Published Mar 1, 2024 • 46
ViewFusion: Towards Multi-View Consistency via Interpolated Denoising Paper • 2402.18842 • Published Feb 29, 2024 • 15