DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for Customized Manga Generation • Paper 2412.07589 • Published Dec 10, 2024
POINTS1.5: Building a Vision-Language Model towards Real World Applications • Paper 2412.08443 • Published Dec 11, 2024
DiffMorpher: Unleashing the Capability of Diffusion Models for Image Morphing • Paper 2312.07409 • Published Dec 12, 2023
FreeControl: Training-Free Spatial Control of Any Text-to-Image Diffusion Model with Any Condition • Paper 2312.07536 • Published Dec 12, 2023
BuboGPT: Enabling Visual Grounding in Multi-Modal LLMs • Paper 2307.08581 • Published Jul 17, 2023
JourneyDB: A Benchmark for Generative Image Understanding • Paper 2307.00716 • Published Jul 3, 2023
ChessGPT: Bridging Policy Learning and Language Modeling • Paper 2306.09200 • Published Jun 15, 2023