Collections
Collections including paper arxiv:2401.16465
- ThemeStation: Generating Theme-Aware 3D Assets from Few Exemplars
  Paper • 2403.15383 • Published • 13
- FlexiDreamer: Single Image-to-3D Generation with FlexiCubes
  Paper • 2404.00987 • Published • 21
- MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs
  Paper • 2402.15627 • Published • 34
- Interactive3D: Create What You Want by Interactive 3D Generation
  Paper • 2404.16510 • Published • 18

- RealCustom: Narrowing Real Text Word for Real-Time Open-Domain Text-to-Image Customization
  Paper • 2403.00483 • Published • 12
- OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on
  Paper • 2403.01779 • Published • 27
- Scalable High-Resolution Pixel-Space Image Synthesis with Hourglass Diffusion Transformers
  Paper • 2401.11605 • Published • 21
- FiT: Flexible Vision Transformer for Diffusion Model
  Paper • 2402.12376 • Published • 48

- ViewDiff: 3D-Consistent Image Generation with Text-to-Image Models
  Paper • 2403.01807 • Published • 7
- TripoSR: Fast 3D Object Reconstruction from a Single Image
  Paper • 2403.02151 • Published • 12
- OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on
  Paper • 2403.01779 • Published • 27
- MagicClay: Sculpting Meshes With Generative Neural Fields
  Paper • 2403.02460 • Published • 6

- Training-Free Consistent Text-to-Image Generation
  Paper • 2402.03286 • Published • 64
- ConsistI2V: Enhancing Visual Consistency for Image-to-Video Generation
  Paper • 2402.04324 • Published • 23
- λ-ECLIPSE: Multi-Concept Personalized Text-to-Image Diffusion Models by Leveraging CLIP Latent Space
  Paper • 2402.05195 • Published • 18
- FiT: Flexible Vision Transformer for Diffusion Model
  Paper • 2402.12376 • Published • 48

- MM-LLMs: Recent Advances in MultiModal Large Language Models
  Paper • 2401.13601 • Published • 44
- A Touch, Vision, and Language Dataset for Multimodal Alignment
  Paper • 2402.13232 • Published • 13
- Neural Network Diffusion
  Paper • 2402.13144 • Published • 94
- FlashTex: Fast Relightable Mesh Texturing with LightControlNet
  Paper • 2402.13251 • Published • 13

- The Chosen One: Consistent Characters in Text-to-Image Diffusion Models
  Paper • 2311.10093 • Published • 57
- NeuroPrompts: An Adaptive Framework to Optimize Prompts for Text-to-Image Generation
  Paper • 2311.12229 • Published • 26
- Diffusion Model Alignment Using Direct Preference Optimization
  Paper • 2311.12908 • Published • 47
- VMC: Video Motion Customization using Temporal Attention Adaption for Text-to-Video Diffusion Models
  Paper • 2312.00845 • Published • 36