Cosmos-Reason1: From Physical Common Sense To Embodied Reasoning Paper • 2503.15558 • Published 19 days ago • 45
Unleashing Vecset Diffusion Model for Fast Shape Generation Paper • 2503.16302 • Published 17 days ago • 43
TripoSG: High-Fidelity 3D Shape Synthesis using Large-Scale Rectified Flow Models Paper • 2502.06608 • Published Feb 10 • 40
SongGen: A Single Stage Auto-regressive Transformer for Text-to-Song Generation Paper • 2502.13128 • Published Feb 18 • 41
BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning Paper • 2501.03226 • Published Jan 6 • 45
Dispider: Enabling Video LLMs with Active Real-Time Interaction via Disentangled Perception, Decision, and Reaction Paper • 2501.03218 • Published Jan 6 • 37
IDArb: Intrinsic Decomposition for Arbitrary Number of Input Views and Illuminations Paper • 2412.12083 • Published Dec 16, 2024 • 12
FiVA: Fine-grained Visual Attribute Dataset for Text-to-Image Diffusion Models Paper • 2412.07674 • Published Dec 10, 2024 • 20
Imagine360: Immersive 360 Video Generation from Perspective Anchor Paper • 2412.03552 • Published Dec 4, 2024 • 29
StdGEN: Semantic-Decomposed 3D Character Generation from Single Images Paper • 2411.05738 • Published Nov 8, 2024 • 15
Wavelet Latent Diffusion (Wala): Billion-Parameter 3D Generative Model with Compact Wavelet Encodings Paper • 2411.08017 • Published Nov 12, 2024 • 11
PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction Paper • 2410.17247 • Published Oct 22, 2024 • 47
Geometry Image Diffusion: Fast and Data-Efficient Text-to-3D with Image-Based Surface Representation Paper • 2409.03718 • Published Sep 5, 2024 • 27
VEnhancer: Generative Space-Time Enhancement for Video Generation Paper • 2407.07667 • Published Jul 10, 2024 • 15
MaPa: Text-driven Photorealistic Material Painting for 3D Shapes Paper • 2404.17569 • Published Apr 26, 2024 • 13
CAT3D: Create Anything in 3D with Multi-View Diffusion Models Paper • 2405.10314 • Published May 16, 2024 • 48
CRM: Single Image to 3D Textured Mesh with Convolutional Reconstruction Model Paper • 2403.05034 • Published Mar 8, 2024 • 22