MUMU: Bootstrapping Multimodal Image Generation from Text-to-Image Data Paper • 2406.18790 • Published 19 days ago • 32
4M Models Collection Multimodal models from https://4m.epfl.ch/ • 14 items • Updated about 1 month ago • 29
Glyph-ByT5: A Customized Text Encoder for Accurate Visual Text Rendering Paper • 2403.09622 • Published Mar 14 • 15
Nemotron 4 340B Collection Nemotron-4: open models for Synthetic Data Generation (SDG). Includes Base, Instruct, and Reward models. • 4 items • Updated about 20 hours ago • 150
An Image is Worth 32 Tokens for Reconstruction and Generation Paper • 2406.07550 • Published Jun 11 • 53
The Prompt Report: A Systematic Survey of Prompting Techniques Paper • 2406.06608 • Published Jun 6 • 47
Stable Diffusion 3 Collection Stable Diffusion 3 and related models for text-to-image and image-to-image • 2 items • Updated Jun 12 • 72
Concept Decomposition for Visual Exploration and Inspiration Paper • 2305.18203 • Published May 29, 2023 • 2
BitsFusion: 1.99 bits Weight Quantization of Diffusion Model Paper • 2406.04333 • Published Jun 6 • 36
Flash Diffusion Collection Collection of models distilled using the method proposed in Flash Diffusion paper • 7 items • Updated 27 days ago • 13
Article Train custom AI models with the trainer API and adapt them to 🤗 By not-lain • 16 days ago • 29
MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation Paper • 2404.05674 • Published Apr 8 • 12
sentence-transformers-from-synthetic-data Collection Example of using distilabel to generate synthetic triplets data for fine-tuning a Sentence Transformer model • 4 items • Updated 24 days ago • 20
FlowVid: Taming Imperfect Optical Flows for Consistent Video-to-Video Synthesis Paper • 2312.17681 • Published Dec 29, 2023 • 17
I2VEdit: First-Frame-Guided Video Editing via Image-to-Video Diffusion Models Paper • 2405.16537 • Published May 26 • 15
CameraCtrl: Enabling Camera Control for Text-to-Video Generation Paper • 2404.02101 • Published Apr 2 • 20
Looking Backward: Streaming Video-to-Video Translation with Feature Banks Paper • 2405.15757 • Published May 24 • 14
CommonCanvas: An Open Diffusion Model Trained with Creative-Commons Images Paper • 2310.16825 • Published Oct 25, 2023 • 30
Article Enjoy the Power of Phi-3 with ONNX Runtime on your device By Emma-N • May 22 • 22
ClickDiffusion: Harnessing LLMs for Interactive Precise Image Editing Paper • 2404.04376 • Published Apr 5 • 1
PaliGemma Release Collection Pretrained and mix checkpoints for PaliGemma • 16 items • Updated 18 days ago • 123
Article Multimodal Augmentation for Documents: Recovering “Comprehension” in “Reading and Comprehension” task By danaaubakirova • May 16 • 15
Article Synthetic dataset generation techniques: Self-Instruct By davanstrien • May 15 • 5
StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation Paper • 2405.01434 • Published May 2 • 50
RealmDreamer: Text-Driven 3D Scene Generation with Inpainting and Depth Diffusion Paper • 2404.07199 • Published Apr 10 • 22
InstantFamily: Masked Attention for Zero-shot Multi-ID Image Generation Paper • 2404.19427 • Published Apr 30 • 71
Edit Your Image! Collection Find all the trending and useful Gradio demos that you can use to edit your images. • 21 items • Updated Apr 26 • 23
FABLES: Evaluating faithfulness and content selection in book-length summarization Paper • 2404.01261 • Published Apr 1 • 3
Hyper-SD: Trajectory Segmented Consistency Model for Efficient Image Synthesis Paper • 2404.13686 • Published Apr 21 • 26
Article The Open Medical-LLM Leaderboard: Benchmarking Large Language Models in Healthcare Apr 19 • 86
Article Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent Apr 22 • 75
Factorized Diffusion: Perceptual Illusions by Noise Decomposition Paper • 2404.11615 • Published Apr 17 • 2
[lecture artifacts] aligning open language models Collection artifacts referenced in the talk timeline! Slides: https://docs.google.com/presentation/d/1quMyI4BAx4rvcDfk8jjv063bmHg4RxZd9mhQloXpMn0/edit?usp=sharin • 63 items • Updated Apr 17 • 51
ControlNet++: Improving Conditional Controls with Efficient Consistency Feedback Paper • 2404.07987 • Published Apr 11 • 46
HF-curated models available on Workers AI Collection A collection of models curated with Hugging Face that can be run on Cloudflare's Workers AI serverless inference platform. • 15 items • Updated Apr 2 • 50
SDXS: Real-Time One-Step Latent Diffusion Models with Image Conditions Paper • 2403.16627 • Published Mar 25 • 20
Transparent Image Layer Diffusion using Latent Transparency Paper • 2402.17113 • Published Feb 27 • 5
LayerDiffusion: Layered Controlled Image Editing with Diffusion Models Paper • 2305.18676 • Published May 30, 2023 • 1