ConsistentID: Portrait Generation with Multimodal Fine-Grained Identity Preserving Paper • 2404.16771 • Published 14 days ago • 16
Revisiting Text-to-Image Evaluation with Gecko: On Metrics, Prompts, and Human Ratings Paper • 2404.16820 • Published 14 days ago • 15
MeshLRM: Large Reconstruction Model for High-Quality Mesh Paper • 2404.12385 • Published 21 days ago • 23
ControlNet++: Improving Conditional Controls with Efficient Consistency Feedback Paper • 2404.07987 • Published 28 days ago • 45
InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation Paper • 2404.02733 • Published Apr 3 • 19
CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching Paper • 2404.03653 • Published Apr 4 • 28
LVLM-Interpret: An Interpretability Tool for Large Vision-Language Models Paper • 2404.03118 • Published Apr 3 • 17
MiniGPT4-Video: Advancing Multimodal LLMs for Video Understanding with Interleaved Visual-Textual Tokens Paper • 2404.03413 • Published Apr 4 • 21
AnyV2V: A Plug-and-Play Framework For Any Video-to-Video Editing Tasks Paper • 2403.14468 • Published Mar 21 • 18
RadSplat: Radiance Field-Informed Gaussian Splatting for Robust Real-Time Rendering with 900+ FPS Paper • 2403.13806 • Published Mar 20 • 18
Learning Generalizable Feature Fields for Mobile Manipulation Paper • 2403.07563 • Published Mar 12 • 6
Motion Mamba: Efficient and Long Sequence Motion Generation with Hierarchical and Bidirectional Selective SSM Paper • 2403.07487 • Published Mar 12 • 11
OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on Paper • 2403.01779 • Published Mar 4 • 25
Make a Cheap Scaling: A Self-Cascade Diffusion Model for Higher-Resolution Adaptation Paper • 2402.10491 • Published Feb 16 • 15
Object-Driven One-Shot Fine-tuning of Text-to-Image Diffusion with Prototypical Embedding Paper • 2401.15708 • Published Jan 28 • 9
PALP: Prompt Aligned Personalization of Text-to-Image Models Paper • 2401.06105 • Published Jan 11 • 46
Pearl: A Production-ready Reinforcement Learning Agent Paper • 2312.03814 • Published Dec 6, 2023 • 14
My Projects Collection Projects I've worked on (includes collabs) • 17 items • Updated 10 days ago • 5
DiffiT: Diffusion Vision Transformers for Image Generation Paper • 2312.02139 • Published Dec 4, 2023 • 13
LLaVA-Grounding: Grounded Visual Chat with Large Multimodal Models Paper • 2312.02949 • Published Dec 5, 2023 • 8
Generating Fine-Grained Human Motions Using ChatGPT-Refined Descriptions Paper • 2312.02772 • Published Dec 5, 2023 • 5
Alchemist: Parametric Control of Material Properties with Diffusion Models Paper • 2312.02970 • Published Dec 5, 2023 • 7
Rank-without-GPT: Building GPT-Independent Listwise Rerankers on Open-Source Large Language Models Paper • 2312.02969 • Published Dec 5, 2023 • 11
Fine-grained Controllable Video Generation via Object Appearance and Context Paper • 2312.02919 • Published Dec 5, 2023 • 9
LivePhoto: Real Image Animation with Text-guided Motion Control Paper • 2312.02928 • Published Dec 5, 2023 • 15
Orthogonal Adaptation for Modular Customization of Diffusion Models Paper • 2312.02432 • Published Dec 5, 2023 • 12
Analyzing and Improving the Training Dynamics of Diffusion Models Paper • 2312.02696 • Published Dec 5, 2023 • 31
Describing Differences in Image Sets with Natural Language Paper • 2312.02974 • Published Dec 5, 2023 • 12
X-Adapter: Adding Universal Compatibility of Plugins for Upgraded Diffusion Model Paper • 2312.02238 • Published Dec 4, 2023 • 24
PhysGaussian: Physics-Integrated 3D Gaussians for Generative Dynamics Paper • 2311.12198 • Published Nov 20, 2023 • 19
LucidDreamer: Towards High-Fidelity Text-to-3D Generation via Interval Score Matching Paper • 2311.11284 • Published Nov 19, 2023 • 16
LCM-LoRA: A Universal Stable-Diffusion Acceleration Module Paper • 2311.05556 • Published Nov 9, 2023 • 73
DreamCraft3D: Hierarchical 3D Generation with Bootstrapped Diffusion Prior Paper • 2310.16818 • Published Oct 25, 2023 • 27
PhotoVerse: Tuning-Free Image Customization with Text-to-Image Diffusion Models Paper • 2309.05793 • Published Sep 11, 2023 • 50
Kandinsky: an Improved Text-to-Image Synthesis with Image Prior and Latent Diffusion Paper • 2310.03502 • Published Oct 5, 2023 • 74
Stable Code Collection Suite of developer assistant models • 5 items • Updated about 1 month ago • 33
CodePlan: Repository-level Coding using LLMs and Planning Paper • 2309.12499 • Published Sep 21, 2023 • 68