Collections
Discover the best community collections!
Collections including paper arxiv:2405.01434
-
Scalable Diffusion Models with Transformers
Paper • 2212.09748 • Published • 11 -
Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets
Paper • 2311.15127 • Published • 10 -
Learning Transferable Visual Models From Natural Language Supervision
Paper • 2103.00020 • Published • 9 -
U-Net: Convolutional Networks for Biomedical Image Segmentation
Paper • 1505.04597 • Published • 6
-
Be Yourself: Bounded Attention for Multi-Subject Text-to-Image Generation
Paper • 2403.16990 • Published • 24 -
ViTAR: Vision Transformer with Any Resolution
Paper • 2403.18361 • Published • 50 -
Getting it Right: Improving Spatial Consistency in Text-to-Image Models
Paper • 2404.01197 • Published • 29 -
Bigger is not Always Better: Scaling Properties of Latent Diffusion Models
Paper • 2404.01367 • Published • 19
-
Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction
Paper • 2404.02905 • Published • 63 -
ControlNet++: Improving Conditional Controls with Efficient Consistency Feedback
Paper • 2404.07987 • Published • 47 -
COCONut: Modernizing COCO Segmentation
Paper • 2404.08639 • Published • 26 -
MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs
Paper • 2402.15627 • Published • 33
-
Explorative Inbetweening of Time and Space
Paper • 2403.14611 • Published • 11 -
MovieLLM: Enhancing Long Video Understanding with AI-Generated Movies
Paper • 2403.01422 • Published • 25 -
DiLightNet: Fine-grained Lighting Control for Diffusion-based Image Generation
Paper • 2402.11929 • Published • 9 -
StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text
Paper • 2403.14773 • Published • 9
-
FaceChain-SuDe: Building Derived Class to Inherit Category Attributes for One-shot Subject-Driven Generation
Paper • 2403.06775 • Published • 3 -
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Paper • 2010.11929 • Published • 6 -
Data Incubation -- Synthesizing Missing Data for Handwriting Recognition
Paper • 2110.07040 • Published • 2 -
A Mixture of Expert Approach for Low-Cost Customization of Deep Neural Networks
Paper • 1811.00056 • Published • 2