# Generative Art Framework Comparison
## Most Popular
### OpenAI Dall-E
URL:
Usage: SaaS (via native API and libraries), small base credits to start with, pay-to-play afterwards
Training Size: 12B/6.5B/3.5B params
Notes:
- Commonly used is v2 which is better and smaller than v1
and its getting smaller and faster in each iteration
- Best of available ones for human images
- Style transfers or model modifiers are charged extra
- **Dall-E** is also licensed to 3rd parties as embedded engine:
Microsoft Designer, etc.
- **Craiyon** as free smaller version (was "Dall-E Mini", but renamed due to copyright)
as original architects did not like commercial direction:
- **OpenAI Glide** is also from OpenAI, frequently ignored in favor of Dall-E, but not far result-wise
### MidJourney
URL:
Usage: SaaS (discord bot or web app) only, free to play with, pay-to-play for commercial usage
Lead: David Holz
Notes:
- Developed by research lab after lead sold his previous startup
- Quickest decent looking results, but little tuning available
- Results are often painting-like regardless of desired style
- Often better 3D-effect than others
### CompVis/Stability.AI/RunwayML Stable Diffusion
URL:
Training size: 1.4B params
Usage: SaaS of offline usage, only fully open-source (**Creative ML OpenRAIL-M** license) to self-run
Notes:
- Originally research project by **CompVis**, continuing under **Stability.AI** entity but still open source
- Training in partnership with **RunwayML**
- Weights distributed via **HuggingFace** (only model with weights available)
- Can be fiddly due to large number of modifiers and tunables, not great for faces out-of-the-box
- Best results when using inpainting and adding of negative prompts
- Version v2 removes styles from plenty authors and reduces tunables
Better photo-realistic results, but prompts require far more complexity to guide it
- Official commercial product via **Stability.AI DreamStudio**
## Promising but not Available
### nVidia eDiff-I
URL:
Usage: Not (yet) publicly available
Training size: 9.1B params
Note:
- Looks very promising, especially with built-in style transfers
- Somewhat different internal architecture with single-pass multi-encoders
### Meta Make-a-Scene
URL:
Training size: 4B params
Usage: Not publicly available
Notes:
- Future is likely meta internal tool until it becomes a filter for IG/FG or something
- Can also generate videos: Make-a-Video
### Google Imagen
URL:
Usage: Not publicly available
Training size: 7.9B params
Notes:
- High-end research from **Google Brain**, not a commercial product
- This is commonly used as a benchmark and reference point to see how good any other product is
- Can also generate videos:
- **Google DreamBooth** looks to separate algorithm to allow to
apply **Imagen** textual inversion techniques to other trained models:
### Google Parti
URL:
Usage: Not publicly available
Training size: 20B params
Notes:
- Different architecture as it does not use diffusion at all
- True *SOTA*, but massively large (10x), better than anything
### Microsoft NUWA Infinity
URL:
Notes:
- Looks impressive, but no idea where its heading