Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model · 6 authors · submitted by akhaliq · 52 upvotes · 3 comments
SceneVerse: Scaling 3D Vision-Language Learning for Grounded Scene Understanding · 8 authors · submitted by akhaliq · 17 upvotes · 1 comment
VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models · 7 authors · submitted by akhaliq · 13 upvotes · 2 comments
DeepSpeed-FastGen: High-throughput Text Generation for LLMs via MII and DeepSpeed-Inference · 11 authors · submitted by akhaliq · 12 upvotes · 2 comments
SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers · 6 authors · submitted by akhaliq · 10 upvotes · 1 comment
TextureDreamer: Image-guided Texture Synthesis through Geometry-aware Diffusion · 11 authors · submitted by akhaliq · 8 upvotes · 1 comment
Compose and Conquer: Diffusion-Based 3D Depth Aware Composable Image Synthesis · 5 authors · submitted by akhaliq · 7 upvotes · 2 comments
ICON: Incremental CONfidence for Joint Pose and Radiance Field Optimization · 6 authors · submitted by akhaliq · 5 upvotes · 1 comment