Open Generative AI

Recent Activity

OpenGenAI's activity

sayakpaul
posted an update 7 days ago
Commits speak louder than words 🤪

* 4 new video models
* Multiple image models, including SANA & Flux Control
* New quantizers -> GGUF & TorchAO
* New training scripts

Enjoy this holiday-special Diffusers release 🤗
Notes: https://github.com/huggingface/diffusers/releases/tag/v0.32.0
sayakpaul
posted an update 12 days ago
In the past seven days, the Diffusers team has shipped:

1. Two new video models
2. One new image model
3. Two new quantization backends
4. Three new fine-tuning scripts
5. Multiple fixes and library QoL improvements

Coffee on me if someone can guess 1 - 4 correctly.
sayakpaul
posted an update 21 days ago
Introducing a high-quality open preference dataset to further preference research for image generation.

Preference data has become an inseparable component of modern image generation, yet open preference datasets are a rarity!

So, we decided to work on one with the community!

Check it out here:
https://huggingface.co/blog/image-preferences
sayakpaul
posted an update 21 days ago
The Control family of Flux from @black-forest-labs should be discussed more!

It enables structural controls like ControlNets while being significantly less expensive to run!

So, we're working on a Control LoRA training script 🤗

It's still WIP, so go easy:
https://github.com/huggingface/diffusers/pull/10130
sayakpaul
posted an update about 1 month ago
It's been a while since we shipped native quantization support in diffusers 🧨

We currently support bitsandbytes as the official backend, but using others like torchao is already very simple.

This post is just a reminder of what's possible:

1. Loading a model with a quantization config
2. Saving a model with quantization config
3. Loading a pre-quantized model
4. enable_model_cpu_offload()
5. Training and loading LoRAs into quantized checkpoints

Docs:
https://huggingface.co/docs/diffusers/main/en/quantization/bitsandbytes
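
As a rough illustration of items 1, 2, and 4 in the list above, here is a minimal sketch using the bitsandbytes backend. It assumes `diffusers`, `torch`, and `bitsandbytes` are installed and a CUDA GPU is available. The checkpoint ID, the NF4 settings, the function name, and the `save_dir` argument are illustrative choices, not something the post prescribes.

```python
from typing import Optional


def load_4bit_flux_pipeline(
    model_id: str = "black-forest-labs/FLUX.1-dev",  # assumed example checkpoint
    save_dir: Optional[str] = None,  # hypothetical local output directory
):
    """Load a Flux transformer in 4-bit NF4 and wrap it in a pipeline."""
    # Imports live inside the function so this file can still be imported
    # on machines that lack the GPU stack.
    import torch
    from diffusers import BitsAndBytesConfig, FluxPipeline, FluxTransformer2DModel

    # Item 1: load a model with a quantization config.
    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    )
    transformer = FluxTransformer2DModel.from_pretrained(
        model_id,
        subfolder="transformer",
        quantization_config=quant_config,
        torch_dtype=torch.bfloat16,
    )

    # Item 2: optionally save it. The quantization config is stored with
    # the weights, so the directory can later be reloaded through
    # from_pretrained() as a pre-quantized model (item 3).
    if save_dir is not None:
        transformer.save_pretrained(save_dir)

    # Item 4: build the pipeline around the quantized transformer and
    # offload idle components to the CPU.
    pipe = FluxPipeline.from_pretrained(
        model_id, transformer=transformer, torch_dtype=torch.bfloat16
    )
    pipe.enable_model_cpu_offload()
    return pipe
```

Calling `load_4bit_flux_pipeline()` downloads the checkpoint on first use, so it is best tried on a machine with enough disk space and GPU memory. The linked docs remain the reference for the supported options.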
sayakpaul
posted an update 3 months ago
I did a little experimentation on resizing pre-trained LoRAs on Flux. I explored two themes:

* Decrease the rank of a LoRA
* Increase the rank of a LoRA

The first one helps reduce memory requirements when the LoRA has a high rank, while the second one is merely an experiment. Another implication of this study is the unification of LoRA ranks when you would like to torch.compile() them.

Check it out here:
sayakpaul/flux-lora-resizing
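
The post does not say how the resizing was done, so the following is only a sketch of one plausible approach, not the method used in the linked repository. It assumes truncated SVD for decreasing the rank and zero-padding for increasing it. The function names are hypothetical, and NumPy stands in for PyTorch to keep the example small.

```python
import numpy as np


def reduce_lora_rank(lora_a, lora_b, new_rank):
    """Approximate a LoRA pair with a lower-rank pair via truncated SVD.

    lora_a has shape (rank, in_features); lora_b has shape (out_features, rank).
    The weight update they encode is lora_b @ lora_a.
    """
    delta = lora_b @ lora_a  # dense update; costly for very large layers
    u, s, vt = np.linalg.svd(delta, full_matrices=False)
    # Keep the new_rank largest singular directions.
    u, s, vt = u[:, :new_rank], s[:new_rank], vt[:new_rank, :]
    # Split each singular value evenly between the two factors.
    sqrt_s = np.sqrt(s)
    new_b = u * sqrt_s            # (out_features, new_rank)
    new_a = sqrt_s[:, None] * vt  # (new_rank, in_features)
    return new_a, new_b


def pad_lora_rank(lora_a, lora_b, new_rank):
    """Raise the rank by zero-padding; lora_b @ lora_a is unchanged."""
    rank = lora_a.shape[0]
    if new_rank < rank:
        raise ValueError("new_rank must be >= the current rank")
    extra = new_rank - rank
    new_a = np.concatenate([lora_a, np.zeros((extra, lora_a.shape[1]))], axis=0)
    new_b = np.concatenate([lora_b, np.zeros((lora_b.shape[0], extra))], axis=1)
    return new_a, new_b


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.standard_normal((16, 64))  # rank-16 LoRA "A" matrix
    b = rng.standard_normal((48, 16))  # rank-16 LoRA "B" matrix

    small_a, small_b = reduce_lora_rank(a, b, new_rank=4)
    rel_err = np.linalg.norm(b @ a - small_b @ small_a) / np.linalg.norm(b @ a)
    print(f"rank 16 -> 4, relative error: {rel_err:.3f}")

    big_a, big_b = pad_lora_rank(a, b, new_rank=32)
    print("padding preserved the update:", np.allclose(big_b @ big_a, b @ a))
```

Padding several LoRAs to a common rank gives them identical tensor shapes, which is presumably what makes rank unification relevant to torch.compile().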
sayakpaul
posted an update 5 months ago
Flux.1-Dev-like images, but in fewer steps.

Merging code (very simple), inference code, merged params: sayakpaul/FLUX.1-merged

Enjoy the Monday 🤗
sayakpaul
posted an update 5 months ago
With larger and larger diffusion transformers coming up, it's becoming increasingly important to have some good quantization tools for them.

We present our findings from a series of experiments on quantizing different diffusion pipelines based on diffusion transformers.

We demonstrate excellent memory savings at a small cost in inference latency, which is expected to improve in the coming days.

Diffusers 🤝 Quanto ❤️

This was a juicy collaboration between @dacorvo and myself.

Check out the post to learn all about it
https://huggingface.co/blog/quanto-diffusers
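
To make the memory-versus-latency trade-off mentioned above concrete, here is a toy sketch of symmetric int8 weight quantization in plain Python. It is a generic textbook scheme and not Quanto's implementation; the function names and sample weights are made up for illustration.

```python
from array import array


def quantize_int8(weights):
    """Symmetric per-tensor quantization: floats -> int8 values plus one scale."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = array("b", (max(-127, min(127, round(w / scale))) for w in weights))
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Recover approximate floats; this extra step is one source of overhead."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    weights = [0.12, -0.5, 0.33, 0.9, -0.07]  # made-up sample weights
    as_float32 = array("f", weights)
    quantized, scale = quantize_int8(weights)

    print("bytes as float32:", as_float32.itemsize * len(as_float32))
    print("bytes as int8:   ", quantized.itemsize * len(quantized))
    restored = dequantize_int8(quantized, scale)
    print("max abs error:", max(abs(w - r) for w, r in zip(weights, restored)))
```

Each weight shrinks from a 32-bit float to one signed byte plus a shared scale, at the price of a rounding error of at most half a quantization step per weight and an extra dequantization pass.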
sayakpaul
posted an update 6 months ago
What is your favorite part of our Diffusers integration of Stable Diffusion 3?

My personal favorite is the ability to run it on a variety of different GPUs with minimal code changes.

Learn more about it here:
https://huggingface.co/blog/sd3
sayakpaul
posted an update 7 months ago
🧨 Diffusers 0.28.0 is out 🔥

It features the first non-generative pipeline of the library -- Marigold

Marigold shines at performing Depth Estimation and Surface Normal Estimation. It was contributed by @toshas, one of the authors of Marigold.

This release also features a massive refactor (led by @DN6) of the from_single_file() method, highlighting our efforts to make our library more amenable to community features 🤗

Check out the release notes here:
https://github.com/huggingface/diffusers/releases/tag/v0.28.0
sayakpaul
posted an update 8 months ago
Custom pipelines and components in Diffusers 🎸

Wanted to use customized pipelines and other components (schedulers, unets, text encoders, etc.) in Diffusers?

Found it inflexible?

Since the first dawn on earth, we have supported loading custom pipelines via a custom_pipeline argument 🌄

These pipelines are inference-only, i.e., the assumption is that we're leveraging an existing checkpoint (e.g., runwayml/stable-diffusion-v1-5) and ONLY modifying the pipeline implementation.

We have many cool pipelines implemented that way. They all share the same benefits available to a DiffusionPipeline, no compromise there 🤗

Check them here:
https://github.com/huggingface/diffusers/tree/main/examples/community

You might also need everything customized, i.e., custom components along with a custom pipeline. Sure, that's all possible.

All you have to do is keep the implementations of those custom components on the Hub repository you're loading your pipeline checkpoint from.

SDXL Japanese was implemented like this 🔥
stabilityai/japanese-stable-diffusion-xl

Full guide is available here ⬇️
https://huggingface.co/docs/diffusers/main/en/using-diffusers/custom_pipeline_overview

And, of course, these share all the benefits that come with DiffusionPipeline.
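
The two loading paths described in this post can be sketched as follows. This assumes `diffusers` is installed and the checkpoints can be downloaded from the Hub. The function names are hypothetical, and `lpw_stable_diffusion` is just one community pipeline picked as an example.

```python
def load_community_pipeline(
    checkpoint: str = "runwayml/stable-diffusion-v1-5",
    custom_pipeline: str = "lpw_stable_diffusion",  # assumed example choice
):
    """Reuse an existing checkpoint and swap only the pipeline implementation."""
    # Imported lazily so this file can be imported without diffusers installed.
    from diffusers import DiffusionPipeline

    return DiffusionPipeline.from_pretrained(
        checkpoint, custom_pipeline=custom_pipeline
    )


def load_pipeline_with_custom_components(
    repo_id: str = "stabilityai/japanese-stable-diffusion-xl",
):
    """Load a pipeline whose custom component code lives in the Hub repo."""
    from diffusers import DiffusionPipeline

    # trust_remote_code=True lets Diffusers execute the Python files stored
    # in the repository, so enable it only for repositories you trust.
    return DiffusionPipeline.from_pretrained(repo_id, trust_remote_code=True)
```

Both functions return a regular DiffusionPipeline object, so device placement, offloading, and the other usual pipeline methods work as they do for built-in pipelines.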
osanseviero
posted an update 9 months ago
Diaries of Open Source. Part 15 🤗

🕵️‍♀️ Idefics 2 is out, a multimodal open-source model with very nice capabilities
Models, demo, and datasets: HuggingFaceM4/idefics2-661d1971b7c50831dd3ce0fe
Blog: https://hf.co/blog/idefics2

💾 Snowflake released snowflake-arctic-embed, a family of powerful small embedding models
Model: Snowflake/snowflake-arctic-embed-m
Blog: https://www.snowflake.com/blog/introducing-snowflake-arctic-embed-snowflakes-state-of-the-art-text-embedding-family-of-models/

✨Pile-T5, EleutherAI's T5 model trained on 2T tokens
Blog: https://blog.eleuther.ai/pile-t5/
Models: EleutherAI/pile-t5-65a76a0d0022dd270b385a66
GitHub: https://github.com/EleutherAI/improved-t5

🤖 CodeQwen1.5-7B base and chat models. Models trained on 3T tokens with strong benchmark results for code generation, editing, and SQL
Blog post: https://qwenlm.github.io/blog/codeqwen1.5/
Demo: Qwen/CodeQwen1.5-7b-Chat-demo
Models: Qwen/CodeQwen1.5-7B and Qwen/CodeQwen1.5-7B-Chat

Misc
🦉 DocOwl1.5: Unified Structure Learning for OCR-free Document Understanding mPLUG/DocOwl
👀 Cerule - a tiny Vision LM model Tensoic/Cerule-v0.1
ChemLLM - an LLM for chemistry and molecule science ⚗️ https://hf.co/AI4Chem/ChemLLM-7B-Chat-1.5-DPO
Distil Whisper Large
New pdf/OCR datasets with 19 samples pixparse/pdf-document-ocr-datasets-660701430b0346f97c4bc628
🔥 Gretel AI high quality text-to-sql synthetic dataset gretelai/synthetic_text_to_sql
sayakpaul
posted an update 9 months ago