-
One-for-All: Generalized LoRA for Parameter-Efficient Fine-tuning
Paper • 2306.07967 • Published • 23 -
Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation
Paper • 2306.07954 • Published • 111 -
TryOnDiffusion: A Tale of Two UNets
Paper • 2306.08276 • Published • 71 -
Seeing the World through Your Eyes
Paper • 2306.09348 • Published • 30
Collections
Discover the best community collections!
Collections including paper arxiv:2310.03502
-
Matryoshka Diffusion Models
Paper • 2310.15111 • Published • 39 -
De-Diffusion Makes Text a Strong Cross-Modal Interface
Paper • 2311.00618 • Published • 21 -
MM-VID: Advancing Video Understanding with GPT-4V(ision)
Paper • 2310.19773 • Published • 18 -
SAM-CLIP: Merging Vision Foundation Models towards Semantic and Spatial Understanding
Paper • 2310.15308 • Published • 22
-
BitNet: Scaling 1-bit Transformers for Large Language Models
Paper • 2310.11453 • Published • 94 -
Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection
Paper • 2310.11511 • Published • 63 -
In-Context Learning Creates Task Vectors
Paper • 2310.15916 • Published • 39 -
Matryoshka Diffusion Models
Paper • 2310.15111 • Published • 39
-
BLIP-Diffusion: Pre-trained Subject Representation for Controllable Text-to-Image Generation and Editing
Paper • 2305.14720 • Published • 2 -
GlyphControl: Glyph Conditional Control for Visual Text Generation
Paper • 2305.18259 • Published • 1 -
PhotoVerse: Tuning-Free Image Customization with Text-to-Image Diffusion Models
Paper • 2309.05793 • Published • 50 -
Kandinsky: an Improved Text-to-Image Synthesis with Image Prior and Latent Diffusion
Paper • 2310.03502 • Published • 74
-
OmnimatteRF: Robust Omnimatte with 3D Background Modeling
Paper • 2309.07749 • Published • 6 -
AudioSR: Versatile Audio Super-resolution at Scale
Paper • 2309.07314 • Published • 23 -
Generative Image Dynamics
Paper • 2309.07906 • Published • 51 -
MagiCapture: High-Resolution Multi-Concept Portrait Customization
Paper • 2309.06895 • Published • 27