-
Kandinsky: an Improved Text-to-Image Synthesis with Image Prior and Latent Diffusion
Paper • 2310.03502 • Published • 74 -
Scalable Diffusion Models with Transformers
Paper • 2212.09748 • Published • 8 -
Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets
Paper • 2311.15127 • Published • 6 -
Learning Transferable Visual Models From Natural Language Supervision
Paper • 2103.00020 • Published • 7
Collections
Discover the best community collections!
Collections including paper arxiv:2310.03502
-
Kandinsky: an Improved Text-to-Image Synthesis with Image Prior and Latent Diffusion
Paper • 2310.03502 • Published • 74 -
GLIGEN: Open-Set Grounded Text-to-Image Generation
Paper • 2301.07093 • Published • 3 -
Music Consistency Models
Paper • 2404.13358 • Published • 12 -
Align Your Steps: Optimizing Sampling Schedules in Diffusion Models
Paper • 2404.14507 • Published • 21
-
Kandinsky: an Improved Text-to-Image Synthesis with Image Prior and Latent Diffusion
Paper • 2310.03502 • Published • 74 -
Transferable and Principled Efficiency for Open-Vocabulary Segmentation
Paper • 2404.07448 • Published • 10 -
RegionGPT: Towards Region Understanding Vision Language Model
Paper • 2403.02330 • Published • 2 -
GLIGEN: Open-Set Grounded Text-to-Image Generation
Paper • 2301.07093 • Published • 3
-
Kandinsky: an Improved Text-to-Image Synthesis with Image Prior and Latent Diffusion
Paper • 2310.03502 • Published • 74 -
Transferable and Principled Efficiency for Open-Vocabulary Segmentation
Paper • 2404.07448 • Published • 10 -
Ferret-v2: An Improved Baseline for Referring and Grounding with Large Language Models
Paper • 2404.07973 • Published • 28 -
COCONut: Modernizing COCO Segmentation
Paper • 2404.08639 • Published • 25
-
Demystifying CLIP Data
Paper • 2309.16671 • Published • 17 -
Model Stock: All we need is just a few fine-tuned models
Paper • 2403.19522 • Published • 9 -
Bigger is not Always Better: Scaling Properties of Latent Diffusion Models
Paper • 2404.01367 • Published • 19 -
On the Scalability of Diffusion-based Text-to-Image Generation
Paper • 2404.02883 • Published • 17
-
U-Net: Convolutional Networks for Biomedical Image Segmentation
Paper • 1505.04597 • Published • 5 -
Image Segmentation using U-Net Architecture for Powder X-ray Diffraction Images
Paper • 2310.16186 • Published • 2 -
H-DenseUNet: Hybrid Densely Connected UNet for Liver and Tumor Segmentation from CT Volumes
Paper • 1709.07330 • Published • 2 -
Deep LOGISMOS: Deep Learning Graph-based 3D Segmentation of Pancreatic Tumors on CT scans
Paper • 1801.08599 • Published • 2
-
FaceChain-SuDe: Building Derived Class to Inherit Category Attributes for One-shot Subject-Driven Generation
Paper • 2403.06775 • Published • 3 -
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Paper • 2010.11929 • Published • 5 -
Data Incubation -- Synthesizing Missing Data for Handwriting Recognition
Paper • 2110.07040 • Published • 2 -
A Mixture of Expert Approach for Low-Cost Customization of Deep Neural Networks
Paper • 1811.00056 • Published • 2