Collections

Collections including paper arxiv:2401.01952

Visual Instruction Tuning

Paper • 2304.08485 • Published Apr 17, 2023 • 13
LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents

Paper • 2311.05437 • Published Nov 9, 2023 • 50
Improved Baselines with Visual Instruction Tuning

Paper • 2310.03744 • Published Oct 5, 2023 • 37
Aligning Large Multimodal Models with Factually Augmented RLHF

Paper • 2309.14525 • Published Sep 25, 2023 • 30

Multi-Modal LLM

Instruct-Imagen: Image Generation with Multi-modal Instruction

Paper • 2401.01952 • Published Jan 3, 2024 • 31

LLM Augmented LLMs: Expanding Capabilities through Composition

Paper • 2401.02412 • Published Jan 4, 2024 • 37
Instruct-Imagen: Image Generation with Multi-modal Instruction

Paper • 2401.01952 • Published Jan 3, 2024 • 31

Image generation

Boundary Attention: Learning to Find Faint Boundaries at Any Resolution

Paper • 2401.00935 • Published Jan 1, 2024 • 18
Taming Mode Collapse in Score Distillation for Text-to-3D Generation

Paper • 2401.00909 • Published Dec 31, 2023 • 10
Q-Refine: A Perceptual Quality Refiner for AI-Generated Image

Paper • 2401.01117 • Published Jan 2, 2024 • 10
En3D: An Enhanced Generative Model for Sculpting 3D Humans from 2D Synthetic Data

Paper • 2401.01173 • Published Jan 2, 2024 • 12

Diffusion Models

Instruct-Imagen: Image Generation with Multi-modal Instruction

Paper • 2401.01952 • Published Jan 3, 2024 • 31
ODIN: A Single Model for 2D and 3D Perception

Paper • 2401.02416 • Published Jan 4, 2024 • 13
Bigger is not Always Better: Scaling Properties of Latent Diffusion Models

Paper • 2404.01367 • Published Apr 1, 2024 • 22
Cross-Attention Makes Inference Cumbersome in Text-to-Image Diffusion Models

Paper • 2404.02747 • Published Apr 3, 2024 • 13

Image Synthesis

Instruct-Imagen: Image Generation with Multi-modal Instruction

Paper • 2401.01952 • Published Jan 3, 2024 • 31

Images unlimited

Instruct-Imagen: Image Generation with Multi-modal Instruction

Paper • 2401.01952 • Published Jan 3, 2024 • 31

StreamDiffusion: A Pipeline-level Solution for Real-time Interactive Generation

Paper • 2312.12491 • Published Dec 19, 2023 • 70
Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs

Paper • 2401.11708 • Published Jan 22, 2024 • 30
Training-Free Consistent Text-to-Image Generation

Paper • 2402.03286 • Published Feb 5, 2024 • 67
PALP: Prompt Aligned Personalization of Text-to-Image Models

Paper • 2401.06105 • Published Jan 11, 2024 • 49

CV / Text-to-Image / Image-to-Image / Diffusion

https://huggingface.co/collections/merve/

DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation

Paper • 2208.12242 • Published Aug 25, 2022 • 11
IP-Adapter: Text Compatible Image Prompt Adapter for Text-to-Image Diffusion Models

Paper • 2308.06721 • Published Aug 13, 2023 • 30
h94/IP-Adapter-FaceID

Text-to-Image • Updated Apr 16, 2024 • 408k • 1.67k
PALP: Prompt Aligned Personalization of Text-to-Image Models

Paper • 2401.06105 • Published Jan 11, 2024 • 49
