- Generative AI meets 3D: A Survey on Text-to-3D in AIGC Era
  Paper • 2305.06131 • Published • 2
- Perpetual Humanoid Control for Real-time Simulated Avatars
  Paper • 2305.06456 • Published • 1
- Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold
  Paper • 2305.10973 • Published • 32
- LDM3D: Latent Diffusion Model for 3D
  Paper • 2305.10853 • Published • 10
Collections
Collections including paper arxiv:2310.15111

- One-for-All: Generalized LoRA for Parameter-Efficient Fine-tuning
  Paper • 2306.07967 • Published • 24
- Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation
  Paper • 2306.07954 • Published • 113
- TryOnDiffusion: A Tale of Two UNets
  Paper • 2306.08276 • Published • 72
- Seeing the World through Your Eyes
  Paper • 2306.09348 • Published • 32

- Matryoshka Diffusion Models
  Paper • 2310.15111 • Published • 40
- Data Filtering Networks
  Paper • 2309.17425 • Published • 6
- FlashDecoding++: Faster Large Language Model Inference on GPUs
  Paper • 2311.01282 • Published • 35
- E3 TTS: Easy End-to-End Diffusion-based Text to Speech
  Paper • 2311.00945 • Published • 14

- Matryoshka Diffusion Models
  Paper • 2310.15111 • Published • 40
- De-Diffusion Makes Text a Strong Cross-Modal Interface
  Paper • 2311.00618 • Published • 21
- MM-VID: Advancing Video Understanding with GPT-4V(ision)
  Paper • 2310.19773 • Published • 19
- SAM-CLIP: Merging Vision Foundation Models towards Semantic and Spatial Understanding
  Paper • 2310.15308 • Published • 22

- A Picture is Worth a Thousand Words: Principled Recaptioning Improves Image Generation
  Paper • 2310.16656 • Published • 40
- CommonCanvas: An Open Diffusion Model Trained with Creative-Commons Images
  Paper • 2310.16825 • Published • 32
- Matryoshka Diffusion Models
  Paper • 2310.15111 • Published • 40
- I2VGen-XL: High-Quality Image-to-Video Synthesis via Cascaded Diffusion Models
  Paper • 2311.04145 • Published • 32

- Matryoshka Diffusion Models
  Paper • 2310.15111 • Published • 40
- AToM: Amortized Text-to-Mesh using 2D Diffusion
  Paper • 2402.00867 • Published • 10
- Neural Network Diffusion
  Paper • 2402.13144 • Published • 94
- Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers
  Paper • 2402.19479 • Published • 32

- Matryoshka Diffusion Models
  Paper • 2310.15111 • Published • 40
- SortedNet, a Place for Every Network and Every Network in its Place: Towards a Generalized Solution for Training Many-in-One Neural Networks
  Paper • 2309.00255 • Published • 1
- Sorted LLaMA: Unlocking the Potential of Intermediate Layers of Large Language Models for Dynamic Inference Using Sorted Fine-Tuning (SoFT)
  Paper • 2309.08968 • Published • 22
- Matryoshka Representation Learning
  Paper • 2205.13147 • Published • 9

- Woodpecker: Hallucination Correction for Multimodal Large Language Models
  Paper • 2310.16045 • Published • 14
- HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models
  Paper • 2310.14566 • Published • 25
- SILC: Improving Vision Language Pretraining with Self-Distillation
  Paper • 2310.13355 • Published • 6
- Conditional Diffusion Distillation
  Paper • 2310.01407 • Published • 20