# Change Log for SD.Next

## TODO

- reference styles
- quick apply style

## Update for 2024-03-19

### Highlights 2024-03-19

New models:
- [Stable Cascade](https://github.com/Stability-AI/StableCascade) *Full* and *Lite*
- [Playground v2.5](https://huggingface.co/playgroundai/playground-v2.5-1024px-aesthetic)
- [KOALA 700M](https://github.com/youngwanLEE/sdxl-koala)
- [Stable Video Diffusion XT 1.1](https://huggingface.co/stabilityai/stable-video-diffusion-img2vid-xt-1-1)
- [VGen](https://huggingface.co/ali-vilab/i2vgen-xl)

New pipelines and features:
- Img2img using [LEdit++](https://leditsplusplus-project.static.hf.space/index.html), a context-aware method with image analysis and positive/negative prompt handling
- Trajectory Consistency Distillation [TCD](https://mhh0318.github.io/tcd) for processing in even fewer steps
- Visual Query & Answer using [moondream2](https://github.com/vikhyat/moondream) as an addition to standard interrogate methods
- **Face-HiRes**: simple built-in detailer for face refinements
- Even simpler outpaint: when resizing an image, simply pick an outpaint method and if the image has a different aspect ratio, blank areas will be outpainted!
- UI aspect-ratio controls and other UI improvements
- User-controllable invisible and visible watermarking
- Native composable LoRA

What else?

- **Reference models**: *Networks -> Models -> Reference*: All reference models now come with recommended settings that can be auto-applied if desired
- **Styles**: Not just for prompts!
  Styles can apply *generate parameters* as templates and can be used to *apply wildcards* to prompts
- Various improvements and additional API endpoints
- Given the high interest in the [ZLUDA](https://github.com/vosen/ZLUDA) engine introduced in the last release, we've implemented a much more flexible/automatic install procedure (see [wiki](https://github.com/vladmandic/automatic/wiki/ZLUDA) for details)
- Plus additional improvements such as: smooth tiling, refine/hires workflow improvements, control workflow

Further details:
- For basic instructions, see [README](https://github.com/vladmandic/automatic/blob/master/README.md)
- For more details on all new features see full [CHANGELOG](https://github.com/vladmandic/automatic/blob/master/CHANGELOG.md)
- For documentation, see [WiKi](https://github.com/vladmandic/automatic/wiki)
- [Discord](https://discord.com/invite/sd-next-federal-batch-inspectors-1101998836328697867) server

### Full Changelog 2024-03-19

- [Stable Cascade](https://github.com/Stability-AI/StableCascade) *Full* and *Lite*
  - large multi-stage high-quality model from the warp-ai/wuerstchen team, released by stabilityai
  - download using networks -> reference
  - see [wiki](https://github.com/vladmandic/automatic/wiki/Stable-Cascade) for details
- [Playground v2.5](https://huggingface.co/playgroundai/playground-v2.5-1024px-aesthetic)
  - new model version from Playground: based on SDXL, but with some cool new concepts
  - download using networks -> reference
  - set sampler to *DPM++ 2M EDM* or *Euler EDM*
- [KOALA 700M](https://github.com/youngwanLEE/sdxl-koala)
  - another very fast & light sdxl model where the original unet was compressed and distilled to 54% of its original size
  - download using networks -> reference
  - *note*: to download the fp16 variant (recommended), set settings -> diffusers -> preferred model variant
- [LEdit++](https://leditsplusplus-project.static.hf.space/index.html)
  - context-aware img2img method with image analysis and positive/negative
    prompt handling
  - enable via img2img -> scripts -> ledit
  - uses the following params from standard img2img: cfg scale (recommended ~3), steps (recommended ~50), denoise strength (recommended ~0.7)
  - can use positive and/or negative prompt to guide the editing process
    - positive prompt: what to enhance, strength and threshold for auto-masking
    - negative prompt: what to remove, strength and threshold for auto-masking
  - *note*: not compatible with model offloading
- **Second Pass / Refine**
  - independent upscale and hires options: run hires without upscale or upscale without hires or both
  - upscale can now run at 0.1-8.0 scale and will also run if enabled at 1.0 to allow for upscalers that simply improve image quality
  - updated ui section to reflect changes
  - *note*: behavior using backend:original is unchanged for backwards compatibility
- **Visual Query**: visual query & answer in process tab
  - go to process -> visual query
  - ask your questions, e.g. "describe the image", "what is behind the subject", "what are predominant colors of the image?"
  - primary model is [moondream2](https://github.com/vikhyat/moondream), a *tiny* 1.86B vision language model
    *note*: it's still 3.7GB in size, so not really tiny
  - additional support for multiple variations of several base models: *GIT, BLIP, ViLT, PIX*, sizes range from 0.3 to 1.7GB
- **Video**
  - **Image2Video**
    - new module for creating videos from images
    - simply enable from *img2img -> scripts -> image2video*
    - model is auto-downloaded on first use
    - based on [VGen](https://huggingface.co/ali-vilab/i2vgen-xl)
  - **Stable Video Diffusion**
    - updated with *SVD 1.0, SVD XT 1.0 and SVD XT 1.1*
    - models are auto-downloaded on first use
    - simply enable from *img2img -> scripts -> stable video diffusion*
    - for svd 1.0, use frames=~14, for xt models use frames=~25
- **Composable LoRA**, thanks @AI-Casanova
  - control lora strength for each step
    for example: `` means strength=0.1 for the step at 0% and interpolate towards strength=0.9 for the step at 100%
  - *note*: this is a very experimental feature and may not work as expected
- **Control**
  - added *refiner/hires* workflows
  - added resize methods to before/after/mask: fixed, crop, fill
- **Styles**: styles are not just for prompts!
  - new styles editor: *networks -> styles -> edit*
  - styles can apply generate parameters, for example to have a style that enables and configures hires:
    parameters=`enable_hr: True, hr_scale: 2, hr_upscaler: Latent Bilinear antialias, hr_sampler_name: DEIS, hr_second_pass_steps: 20, denoising_strength: 0.5`
  - styles can apply wildcards to prompts, for example:
    wildcards=`movie=mad max, dune, star wars, star trek; intricate=realistic, color sketch, pencil sketch, intricate`
  - as usual, you can apply any number of styles so you can choose which settings are applied, in which order, and which wildcards are used
- **UI**
  - *aspect-ratio*: add selector and lock to width/height control
    allowed aspect ratios can be configured via *settings -> user interface*
  - *interrogate* tab is now merged into *process* tab
  - *image viewer* now displays image metadata
  - *themes*: improved on-the-fly switching
  - *log monitor*: flag server warnings/errors and overall improved display
  - *control*: separate processor settings from unit settings
- **Face HiRes**
  - new *face restore* option, works similar to the well-known *adetailer* by running an inpaint on detected faces, but with just a checkbox to enable/disable
  - set as default face restorer in settings -> postprocessing
  - disabled by default, to enable simply check *face restore* in your generate advanced settings
  - strength, steps and sampler are set by the hires section in the refine menu
  - strength can be overridden in settings -> postprocessing
  - will use secondary prompt and secondary negative prompt if present in refine
- **Watermarking**
  - SD.Next disables all known watermarks in models, but does allow the user to set a custom watermark
  - see *settings -> image options -> watermarking*
  - invisible watermark: using steganography
  - image watermark: overlaid on top of the image
- **Reference models**
  - additional reference models available for single-click download & run:
    *Stable Cascade, Stable
    Cascade lite, Stable Video Diffusion XT 1.1*
  - reference models will now download the *fp16* variation by default
  - reference models will print recommended settings to the log if present
  - new setting in extra networks: *use reference values when available*
    disabled by default, if enabled will force use of reference settings for models that have them
- **Samplers**
  - [TCD](https://mhh0318.github.io/tcd/): Trajectory Consistency Distillation
    new sampler that produces consistent results in a very low number of steps (comparable to LCM but without reliance on LoRA)
    for best results, use with TCD LoRA:
  - *DPM++ 2M EDM* and *Euler EDM*
    EDM is a new solver algorithm currently available for DPM++ 2M and Euler samplers
    note that using EDM samplers with non-EDM optimized models will produce just noise, and vice-versa
- **Improvements**
  - **FaceID** extended support for LoRA, HyperTile and FreeU, thanks @Trojaner
  - **Tiling** now extends to both Unet and VAE, producing smoother outputs, thanks @AI-Casanova
  - new setting in image options: *include mask in output*
  - improved params parsing from prompt string and styles
  - default theme updates and additional built-in theme *black-gray*
  - support models with their own YAML model config files
  - support models with their own JSON per-component config files, for example: `playground-v2.5_vae.config`
  - prompt can have comments enclosed with `/*` and `*/`
    comments are extracted from the prompt and added to image metadata
- **ROCm**
  - add **ROCm** 6.0 nightly option to installer, thanks @jicka
  - add *flash attention* support for rdna3, thanks @Disty0
    install the flash_attn package for rdna3 manually and enable *flash attention* from *compute settings*
    to install flash_attn, activate the venv and run `pip install -U git+https://github.com/ROCm/flash-attention@howiejay/navi_support`
- **IPEX**
  - disabled IPEX Optimize by default
- **API**
  - add preprocessor api endpoints
    GET:`/sdapi/v1/preprocessors`, POST:`/sdapi/v1/preprocess`, sample script: `cli/simple-preprocess.py`
  - add masking api endpoints
    GET:`/sdapi/v1/masking`, POST:`/sdapi/v1/mask`, sample script: `cli/simple-mask.py`
- **Internal**
  - improved vram efficiency for model compile, thanks @Disty0
  - **stable-fast** compatibility with torch 2.2.1
  - remove obsolete textual inversion training code
  - remove obsolete hypernetworks training code
- **Refiner** validated workflows:
  - fully functional: SD15 + SD15, SDXL + SDXL, SDXL + SDXL-R
  - functional, but result is not as good: SD15 + SDXL, SDXL + SD15, SD15 + SDXL-R
- **SDXL Lightning** models just work, just make sure to set CFG Scale to 0
  and choose a best-suited sampler; it may not be the one you're used to (e.g. maybe even basic Euler)
- **Fixes**
  - improve *model cpu offload* compatibility
  - improve *model sequential offload* compatibility
  - improve *bfloat16* compatibility
  - improve *xformers* installer to match cuda version and install triton
  - fix extra networks refresh
  - fix *sdp memory attention* in backend original
  - fix autodetect sd21 models
  - fix api info endpoint
  - fix *sampler eta* in xyz grid, thanks @AI-Casanova
  - fix *requires_aesthetics_score* errors
  - fix t2i-canny
  - fix *differential diffusion* for manual mask, thanks @23pennies
  - fix ipadapter apply/unapply on batch runs
  - fix control with multiple units and override images
  - fix control with hires
  - fix control-lllite
  - fix font fallback, thanks @NetroScript
  - update civitai downloader to handle new metadata
  - improve control error handling
  - use default model variant if specified variant doesn't exist
  - use diffusers lora load override for *lcm/tcd/turbo loras*
  - exception handler around vram memory stats gather
  - improve ZLUDA installer with `--use-zluda` cli param, thanks @lshqqytiger

## Update for 2024-02-22

Only 3 weeks since the last release, but here's another feature-packed one!
This time the release schedule was shorter as we wanted to get some of the fixes out faster.

### Highlights 2024-02-22

- **IP-Adapters** & **FaceID**: multi-adapter and multi-image support
- New optimization engines: [DeepCache](https://github.com/horseee/DeepCache), [ZLUDA](https://github.com/vosen/ZLUDA) and **Dynamic Attention Slicing**
- New built-in pipelines: [Differential diffusion](https://github.com/exx8/differential-diffusion) and [Regional prompting](https://github.com/huggingface/diffusers/blob/main/examples/community/README.md#regional-prompting-pipeline)
- Big updates to: **Outpainting** (noised-edge-extend), **Clip-skip** (interpolate with non-integer values!), **CFG end** (prevent overburn on high CFG scales), **Control** module masking functionality
- All reported issues since the last release are addressed and included in this release

Further details:
- For basic instructions, see [README](https://github.com/vladmandic/automatic/blob/master/README.md)
- For more details on all new features see full [CHANGELOG](https://github.com/vladmandic/automatic/blob/master/CHANGELOG.md)
- For documentation, see [WiKi](https://github.com/vladmandic/automatic/wiki)
- [Discord](https://discord.com/invite/sd-next-federal-batch-inspectors-1101998836328697867) server

### Full ChangeLog for 2024-02-22

- **Improvements**:
  - **IP Adapter** major refactor
    - support for **multiple input images** for each ip adapter
    - support for **multiple concurrent ip adapters**
      *note*: you cannot mix & match ip adapters that use different *CLiP* models, for example `Base` and `Base ViT-G`
    - add **adapter start/end** to settings, thanks @AI-Casanova
      having the adapter start late can help with better control over composition and prompt adherence
      having the adapter end early can help with overall quality and performance
    - unified interface in txt2img, img2img and control
    - enhanced xyz grid support
  - **FaceID** now also works with multiple input images!
  - [Differential diffusion](https://github.com/exx8/differential-diffusion)
    img2img generation where you control the strength of each pixel or image area
    can be used with manually created masks or with auto-generated depth-maps
    uses the general denoising strength value
    simply enable from *img2img -> scripts -> differential diffusion*
    *note*: supports sd15 and sdxl models
  - [Regional prompting](https://github.com/huggingface/diffusers/blob/main/examples/community/README.md#regional-prompting-pipeline) as a built-in solution
    usage is the same as the original implementation from @hako-mikan
    click on the title to open docs and see examples of the full syntax on how to use it
    simply enable from *scripts -> regional prompting*
    *note*: supports sd15 models only
  - [DeepCache](https://github.com/horseee/DeepCache) model acceleration
    it can produce massive speedups (2x-5x) with no overhead, but with some loss of quality
    *settings -> compute -> model compile -> deep-cache* and *settings -> compute -> model compile -> cache interval*
  - [ZLUDA](https://github.com/vosen/ZLUDA) experimental support, thanks @lshqqytiger
    - ZLUDA is a CUDA wrapper that can be used for GPUs without native support
    - best use case is *AMD GPUs on Windows*, see [wiki](https://github.com/vladmandic/automatic/wiki/ZLUDA) for details
  - **Outpaint**: control outpaint now uses a new algorithm: noised-edge-extend
    the new method allows for much larger outpaint areas in a single pass, even outpainting 512->1024 works well
    note that denoise strength should be increased for larger outpaint areas, for example outpainting 512->1024 works well with denoise 0.75
    outpaint can run in *img2img* mode (default) and *inpaint* mode where the original image is masked (if inpaint masked only is selected)
  - **Clip-skip** reworked completely, thanks @AI-Casanova & @Disty0
    clip-skip range is now 0-12 where previously the lowest value was 1 (default is still 1)
    values can also be decimal to interpolate between different
    layers, for example `clip-skip: 1.5`, thanks @AI-Casanova
  - **CFG End**: new param to control image generation guidance, thanks @AI-Casanova
    sometimes you want strong control over composition, but you want it to stop at some point
    for example, when used with ip-adapters or controlnet, a high cfg scale can overpower the guided image
  - **Control**
    - when performing inpainting, you can specify the processing resolution using **size->mask**
    - units now have an extra option to re-use the current preview image as processor input
  - **Cross-attention**: refactored cross-attention methods, thanks @Disty0
    - for backend:original, it's unchanged: SDP, xFormers, Doggettxs, InvokeAI, Sub-quadratic, Split attention
    - for backend:diffusers, the list is now: SDP, xFormers, Batch matrix-matrix, Split attention, Dynamic Attention BMM, Dynamic Attention SDP
      note: you may need to update your settings! Attention Slicing is renamed to Split attention
    - for ROCm, updated default cross-attention to Scaled Dot Product
  - **Dynamic Attention Slicing**, thanks @Disty0
    - dynamically slices attention queries in order to keep them under the slice rate
      slicing only gets triggered if the query size is larger than the slice rate, to gain performance
      *Dynamic Attention Slicing BMM* uses *Batch matrix-matrix*
      *Dynamic Attention Slicing SDP* uses *Scaled Dot Product*
    - *settings -> compute settings -> attention -> dynamic attention slicing*
  - **ONNX**:
    - allow specifying the onnx default provider and cpu fallback
      *settings -> diffusers*
    - allow manual install of a specific onnx flavor
      *settings -> onnx*
    - better handling of `fp16` models/vae, thanks @lshqqytiger
  - **OpenVINO**: update to `torch 2.2.0`, thanks @Disty0
  - **HyperTile**: additional options, thanks @Disty0
    - add swap size option
    - add use only for hires pass option
  - add `--theme` cli param to force theme on startup
  - add `--allow-paths` cli param to add additional paths that are allowed to be accessed via web,
    thanks @OuticNZ
- **Wiki**:
  - added benchmark notes for IPEX, OpenVINO and Olive
  - added ZLUDA wiki page
- **Internal**
  - update dependencies
  - refactor txt2img/img2img api
  - enhanced theme loader
  - add additional debug env variables
  - enhanced sdp cross-optimization control
    see *settings -> compute settings*
  - experimental support for *python 3.12*
- **Fixes**:
  - add variation seed to diffusers txt2img, thanks @AI-Casanova
  - add cmd param `--skip-env` to skip setting of environment parameters during sdnext load
  - handle extensions that install conflicting versions of packages
    `onnxruntime`, `opencv2-python`
  - installer: refresh package cache on any install
  - fix embeddings registration on server startup, thanks @AI-Casanova
  - ipex: handle dependencies, thanks @Disty0
  - insightface: handle dependencies
  - img2img mask blur and padding
  - xyz grid: handle ip adapter name and scale
  - lazy loading of image may prevent metadata from being loaded on time
  - allow startup without valid models folder
  - fix interrogate api endpoint
  - control: fix resize causing runtime errors
  - control: fix processor override image after processor change
  - control: fix display grid with batch
  - control: restore pipeline before running scripts/extensions
  - handle pipelines that return dict instead of object
  - lora: use strict name matching if preferred option is by-filename
  - fix inpaint mask only for diffusers
  - fix vae dtype mismatch, thanks @Disty0
  - fix controlnet inpaint mask
  - fix theme list refresh
  - fix extensions update information in ui
  - fix taesd with bfloat16
  - fix model merge manual merge settings, thanks @AI-Casanova
  - fix gradio instant update issues for textboxes in quicksettings
  - fix rembg missing dependency
  - bind controlnet extension to last known working commit, thanks @Aptronymist
  - prompts-from-file: fix resizable prompt area

## Update for 2024-02-07

Another big release just hit the shelves!

### Highlights 2024-02-07

- A lot more functionality in the **Control** module:
  - Inpaint and outpaint support, flexible resizing options, optional hires
  - Built-in support for many new processors and models, all auto-downloaded on first use
  - Full support for scripts and extensions
- Complete **Face** module
  implements all variations of **FaceID**, **FaceSwap** and the latest **PhotoMaker** and **InstantID**
- Much enhanced **IPAdapter** modules
- Brand new **Intelligent masking**, manual or automatic
  Using ML models (*LAMA* object removal, *REMBG* background removal, *SAM* segmentation, etc.) and with live previews
  With granular blur, erode and dilate controls
- New models and pipelines:
  **Segmind SegMoE**, **Mixture Tiling**, **InstaFlow**, **SAG**, **BlipDiffusion**
- Massive work integrating latest advances with [OpenVINO](https://github.com/vladmandic/automatic/wiki/OpenVINO), [IPEX](https://github.com/vladmandic/automatic/wiki/Intel-ARC) and [ONNX Olive](https://github.com/vladmandic/automatic/wiki/ONNX-Runtime-&-Olive)
- Full control over brightness, sharpness, color shifts and color grading during the generate process, directly in latent space
- **Documentation**! This was a big one, with a lot of new content and updates in the [WiKi](https://github.com/vladmandic/automatic/wiki)

Plus welcome additions to **UI performance, usability and accessibility** and flexibility of deployment, as well as **API** improvements
And it also includes fixes for all reported issues so far

As of this release, the default backend is set to **diffusers** as it's more feature-rich than **original** and supports many additional models (the original backend remains fully supported)

Also, previous versions of **SD.Next** were tuned for balance between performance and resource usage.
With this release, the focus is more on performance.
See [Benchmark](https://github.com/vladmandic/automatic/wiki/Benchmark) notes for details, but as a highlight, we are now hitting **~110-150 it/s** on a standard nVidia RTX4090 in optimal scenarios!

Further details:
- For basic instructions, see [README](https://github.com/vladmandic/automatic/blob/master/README.md)
- For more details on all new features see full [CHANGELOG](https://github.com/vladmandic/automatic/blob/master/CHANGELOG.md)
- For documentation, see [WiKi](https://github.com/vladmandic/automatic/wiki)

### Full ChangeLog 2024-02-07

- Heavily updated [Wiki](https://github.com/vladmandic/automatic/wiki)
- **Control**:
  - new docs:
    - [Control overview](https://github.com/vladmandic/automatic/wiki/Control)
    - [Control guide](https://github.com/vladmandic/automatic/wiki/Control-Guide), thanks @Aptronymist
  - add **inpaint** support
    applies to both *img2img* and *controlnet* workflows
  - add **outpaint** support
    applies to both *img2img* and *controlnet* workflows
    *note*: increase denoising strength since the outpainted area is blank by default
  - new **mask** module
    - granular blur (gaussian), erode (reduce or remove noise) and dilate (pad or expand)
    - optional **live preview**
    - optional **auto-segmentation** using ml models
      auto-segmentation can be done using **segment-anything** models or **rembg** models
      *note*: auto-segmentation will automatically expand the user-masked area to segments that include the current user mask
    - optional **auto-mask**
      if you don't provide a mask or the mask is empty, you can instead use auto-mask to automatically generate a mask
      this is especially useful if you want to use advanced masking on batch or video inputs and don't want to manually mask each image
      *note*: such an auto-created mask is also subject to all other selected settings such as auto-segmentation, blur, erode and dilate
    - optional **object removal** using the LaMA model
      remove selected objects from images with a single click
      works best
      when combined with auto-segmentation to remove smaller objects
    - masking can be combined with control processors, in which case the mask is applied before the processor
    - the unmasked part of the image is optionally applied to the final image as an overlay, see setting `mask_apply_overlay`
  - support for many additional controlnet models
    built-in models now include 30+ SD15 models and 15+ SDXL models
  - allow **resize** both *before* and *after* the generate operation
    this allows for workflows such as: *image -> upscale or downscale -> generate -> upscale or downscale -> output*
    providing more flexibility than the standard hires workflow
    *note*: resizing before generate can be done using standard upscalers or latent
  - implicit **hires**
    since hires is only used for txt2img, control reuses existing resize functionality
    any image size is used as the txt2img target size
    but if resize scale is also set, it is used to additionally upscale the image after initial txt2img and for the hires pass
  - add support for **scripts** and **extensions**
    you can now combine the control workflow with your favorite script or extension
    *note*: extensions that are hard-coded for txt2img or img2img tabs may not work until they are updated
  - add **depth-anything** depth map processor and trained controlnet
  - add **marigold** depth map processor
    this is a state-of-the-art depth estimation model, but it's quite heavy on resources
  - add **openpose xl** controlnet
  - add blip/booru **interrogate** functionality to both input and output images
  - configurable output folder in settings
  - auto-refresh available models on tab activate
  - add image preview for override images set per-unit
  - more compact unit layout
  - reduce usage of temp files
  - add context menu to action buttons
  - move ip-adapter implementation to control tabs
  - resize-by now applies to each input image or frame individually
    allows for processing where input images are of different sizes
  - support controlnets with non-default yaml
    config files
  - implement resize modes for override images
  - allow any selection of units
  - dynamically install dependencies required by specific processors
  - fix input image size
  - fix video color mode
  - fix correct image mode
  - fix batch/folder/video modes
  - fix processor switching within same unit
  - fix pipeline switching between different modes
- **Face** module
  implements all variations of **FaceID**, **FaceSwap** and the latest **PhotoMaker** and **InstantID**
  simply select from scripts and choose your favorite method and model
  *note*: all models are auto-downloaded on first use
  - [FaceID](https://huggingface.co/h94/IP-Adapter-FaceID)
    - faceid guides image generation given the input image
    - full implementation for *SD15* and *SD-XL*, to use simply select from *Scripts*
      **Base** (93MB) uses *InsightFace* to generate face embeds and *OpenCLIP-ViT-H-14* (2.5GB) as image encoder
      **Plus** (150MB) uses *InsightFace* to generate face embeds and *CLIP-ViT-H-14-laion2B* (3.8GB) as image encoder
      **SDXL** (1022MB) uses *InsightFace* to generate face embeds and *OpenCLIP-ViT-bigG-14* (3.7GB) as image encoder
  - [FaceSwap](https://github.com/deepinsight/insightface/blob/master/examples/in_swapper/README.md)
    - performs face swapping at the end of generation
    - based on InsightFace in-swapper
  - [PhotoMaker](https://github.com/TencentARC/PhotoMaker)
    - for *SD-XL* only
    - new model from TencentARC using a similar concept as IPAdapter, but with a different implementation,
      allowing full concept swaps between input images and generated images using trigger words
    - note: the trigger word must match exactly one term in the prompt for the model to work
  - [InstantID](https://github.com/InstantID/InstantID)
    - for *SD-XL* only
    - based on custom-trained ip-adapter and controlnet combined concepts
    - note: the controlnet appears to be heavily watermarked
  - enable use via api, thanks @trojaner
- [IPAdapter](https://huggingface.co/h94/IP-Adapter)
  - additional models for *SD15* and *SD-XL*, to use simply select from *Scripts*:
    **SD15**: Base, Base ViT-G, Light, Plus, Plus Face, Full Face
    **SDXL**: Base SDXL, Base ViT-H SDXL, Plus ViT-H SDXL, Plus Face ViT-H SDXL
  - enable use via api, thanks @trojaner
- [Segmind SegMoE](https://github.com/segmind/segmoe)
  - initial support for reference models
    download & load via networks -> models -> reference -> **SegMoE SD 4x2** (3.7GB), **SegMoE XL 2x1** (10GB), **SegMoE XL 4x2**
  - note: since segmoe is basically a sequential mix of unets from multiple models, it can get large
    SD 4x2 is ~4GB, XL 2x1 is ~10GB and XL 4x2 is 18GB
  - supports lora, thanks @AI-Casanova
  - support for creating and loading custom mixes will be added in the future
- [Mixture Tiling](https://arxiv.org/abs/2302.02412)
  - uses multiple prompts to guide different parts of the grid during the diffusion process
  - can be used to create complex scenes with multiple subjects
  - simply select from scripts
- [Self-attention guidance](https://github.com/SusungHong/Self-Attention-Guidance)
  - simply select scale in the advanced menu
  - can drastically improve image coherence as well as reduce artifacts
  - note: only compatible with some schedulers
- [FreeInit](https://tianxingwu.github.io/pages/FreeInit/) for **AnimateDiff**
  - greatly improves temporal consistency of generated outputs
  - all options are available in the animatediff script
- [SalesForce BlipDiffusion](https://huggingface.co/docs/diffusers/api/pipelines/blip_diffusion)
  - model can be used to place a subject in a different context
  - requires an input image
  - the last word in the prompt and negative prompt will be used as source and target subjects
  - sampler must be set to default before loading the model
- [InstaFlow](https://github.com/gnobitab/InstaFlow)
  - another take on super-fast image generation in a single step
  - set *sampler:default, steps:1, cfg-scale:0*
  - load from
    networks -> models -> reference
- **Improvements**
  - **ui**
    - check version and **update** SD.Next via UI
      simply go to: settings -> update
    - globally configurable **font size**
      will dynamically rescale the ui depending on settings -> user interface
    - built-in **themes** can be changed on-the-fly
      this does not work with gradio-default themes as their css is created by gradio itself
    - two new **themes**: *simple-dark* and *simple-light*
    - modularized blip/booru interrogate
      now appears as toolbuttons on image/gallery output
    - faster browser page load
    - update hints, thanks @brknsoul
    - cleanup settings
  - **server**
    - all move/offload options are disabled by default for optimal performance
      enable manually if low on vram
  - **server startup**: performance
    - reduced module imports
      ldm support is now only loaded when running in backend=original
    - faster extension load
    - faster json parsing
    - faster lora indexing
    - lazy load optional imports
    - batch embedding load, thanks @midcoastal and @AI-Casanova
      10x+ faster embeddings load for large numbers of embeddings, now works for 1000+ embeddings
    - file and folder list caching, thanks @midcoastal
      if you have a lot of files and/or are using slower or non-local storage, this speeds up file access a lot
    - add `SD_INSTALL_DEBUG` env variable to trace all `git` and `pip` operations
  - **extra networks**
    - 4x faster civitai metadata and previews lookup
    - better display and selection of tags & trigger words
      if hashes are calculated, trigger words will only be displayed for the actual model version
    - better matching of previews
    - better search, including searching for multiple keywords or using full regex
      see the wiki page for more details on syntax
      thanks @NetroScript
    - reduce html overhead
  - **model compression**, thanks @Disty0
    - using built-in NNCF model compression, you can reduce the size of your models significantly
      example: up to 3.4GB of VRAM saved for an SD-XL model!
    - see [wiki](https://github.com/vladmandic/automatic/wiki/Model-Compression-with-NNCF) for details
  - **embeddings**
    you can now use sd 1.5 embeddings with your sd-xl models, thanks @AI-Casanova
    conversion is done on-the-fly, is completely transparent and the result is an approximation of the embedding
    to enable: settings -> extra networks -> auto-convert embeddings
  - **offline deployment**: allow deployment without git clone
    for example, you can now deploy a zip of the sdnext folder
  - **latent upscale**: updated latent upscalers (some are new)
    *nearest, nearest-exact, area, bilinear, bicubic, bilinear-antialias, bicubic-antialias*
  - **scheduler**: added `SA Solver`
  - **model load to gpu**
    new option in settings -> diffusers allowing models to be loaded directly to GPU while keeping RAM free
    this option is not compatible with any kind of model offloading as the model is expected to stay in GPU
    additionally, all model moves can now be traced with env variable `SD_MOVE_DEBUG`
  - **xyz grid**
    - range control
      example: `5.0-6.0:3` will generate 3 images with values `5.0,5.5,6.0`
      example: `10-20:4` will generate 4 images with values `10,13,16,20`
    - continue on error
      now you can use xyz grid with different params and test which ones work and which don't
    - correct font scaling, thanks @nCoderGit
  - **hypertile**
    - enable vae tiling
    - add autodetect optimal value
      set tile size to 0 to use the autodetected value
  - **cli**
    - `sdapi.py`: allow manual api invoke
      example: `python cli/sdapi.py /sdapi/v1/sd-models`
    - `image-exif.py`: improve metadata parsing
    - `install-sf`: helper script to automatically find the best available stable-fast package for the platform
  - **memory**: add ram usage monitoring in addition to gpu memory usage monitoring
  - **vae**: enable taesd batch decode
    enable/disable with settings -> diffusers -> vae slicing
- **compile**
  - new option: **fused projections**
    pretty much a free 5% performance boost for compatible
+    models
+    enable in settings -> compute settings
+  - new option: **dynamic quantization** (experimental)
+    reduces memory usage and increases performance
+    enable in settings -> compute settings
+    best used together with torch compile: *inductor*
+    this feature is highly experimental and will evolve over time
+    requires nightly versions of `torch` and `torchao`
+    > `pip install -U --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu121`
+    > `pip install -U git+https://github.com/pytorch-labs/ao`
+  - new option: **compile text encoder** (experimental)
+- **correction**
+  - new section in generate, allows for image corrections during generation directly in latent space
+  - adds *brightness*, *sharpness* and *color* controls, thanks @AI-Casanova
+  - adds *color grading* controls, thanks @AI-Casanova
+  - replaces old **hdr** section
+- **IPEX**, thanks @disty0
+  - see [wiki](https://github.com/vladmandic/automatic/wiki/Intel-ARC) for details
+  - rewrite ipex hijacks without CondFunc
+    improves compatibility and performance
+    fixes random memory leaks
+  - out of the box support for Intel Data Center GPU Max Series
+  - remove IPEX / Torch 2.0 specific hijacks
+  - add `IPEX_SDPA_SLICE_TRIGGER_RATE`, `IPEX_ATTENTION_SLICE_RATE` and `IPEX_FORCE_ATTENTION_SLICE` env variables
+  - disable 1024x1024 workaround if the GPU supports 64 bit
+  - fix lock-ups at very high resolutions
+- **OpenVINO**, thanks @disty0
+  - see [wiki](https://github.com/vladmandic/automatic/wiki/OpenVINO) for details
+  - **quantization support with NNCF**
+    run 8 bit directly without autocast
+    enable *OpenVINO Quantize Models with NNCF* from *Compute Settings*
+  - **4-bit support with NNCF**
+    enable *Compress Model weights with NNCF* from *Compute Settings* and set a 4-bit NNCF mode
+    select both CPU and GPU from the device selection if you want to use 4-bit or 8-bit modes on GPU
+  - experimental support for *Text Encoder* compiling
+    OpenVINO is faster than IPEX
+    now
+  - update to OpenVINO 2023.3.0
+  - add device selection to `Compute Settings`
+    selecting multiple devices will use the `HETERO` device
+  - remove `OPENVINO_TORCH_BACKEND_DEVICE` env variable
+  - reduce system memory usage after compile
+  - fix cache loading with multiple models
+- **Olive** support, thanks @lshqqytiger
+  - fully merged in; see [wiki](https://github.com/vladmandic/automatic/wiki/ONNX-Runtime-&-Olive) for details
+  - as a highlight, 4-5 it/s using DirectML on AMD GPU translates to 23-25 it/s using ONNX/Olive!
+- **fixes**
+  - civitai model download: enable downloads of embeddings
+  - ipadapter: allow changing of model/image on-the-fly
+  - ipadapter: fix fallback of cross-attention on unload
+  - rebasin iterations, thanks @AI-Casanova
+  - prompt scheduler, thanks @AI-Casanova
+  - python: fix python 3.9 compatibility
+  - sdxl: fix positive prompt embeds
+  - img2img: clip and blip interrogate
+  - img2img: sampler selection offset
+  - img2img: support variable aspect ratio without explicit resize
+  - cli: add `simple-upscale.py` script
+  - cli: fix cmd args parsing
+  - cli: add `run-benchmark.py` script
+  - api: add `/sdapi/v1/version` endpoint
+  - api: add `/sdapi/v1/platform` endpoint
+  - api: return current image in progress api if requested
+  - api: sanitize response object
+  - api: cleanup error logging
+  - api: fix api-only errors
+  - api: fix image to base64
+  - api: fix upscale
+  - refiner: fix use of sd15 models as refiners in second pass
+  - refiner: enable none as option in xyz grid
+  - sampler: add sampler options info to metadata
+  - sampler: guard against invalid sampler index
+  - sampler: add img2img_extra_noise option
+  - config: reset default cfg scale to 6.0
+  - hdr: fix math, thanks @AI-Casanova
+  - processing: correct display metadata
+  - processing: fix batch file names
+  - live preview: fix when using `bfloat16`
+  - live preview: add thread locking
+  - upscale: fix ldsr
+  - huggingface: handle fallback model variant on load
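The XYZ grid range syntax described earlier (`5.0-6.0:3` → `5.0,5.5,6.0`, `10-20:4` → `10,13,16,20`) can be implemented as a small parser. A minimal sketch, not the actual SD.Next implementation; the truncate-to-int behavior for integer endpoints is inferred from the `10-20:4` example:

```python
import re

def parse_range(spec):
    """Parse 'start-end:count' into `count` evenly spaced values.

    Hypothetical re-implementation of the range syntax shown above:
    values are truncated to integers when both endpoints are integers.
    """
    m = re.fullmatch(r"([0-9.]+)-([0-9.]+):(\d+)", spec.strip())
    if not m:
        raise ValueError(f"invalid range spec: {spec}")
    start_s, end_s, count_s = m.groups()
    start, end, count = float(start_s), float(end_s), int(count_s)
    step = (end - start) / (count - 1) if count > 1 else 0.0
    values = [start + i * step for i in range(count)]
    if "." not in start_s and "." not in end_s:
        # integer endpoints: truncate, matching 10-20:4 -> 10,13,16,20
        values = [int(v) for v in values]
    return values
```

With this sketch, `parse_range("5.0-6.0:3")` yields `[5.0, 5.5, 6.0]` and `parse_range("10-20:4")` yields `[10, 13, 16, 20]`, matching the changelog examples.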
+  - reference: fix links to models and use safetensors where possible
+  - model merge: handle unbalanced models where not all keys are present, thanks @AI-Casanova
+  - better sdxl model detection
+  - global crlf->lf switch
+  - model type switch when submodels are loaded
+  - clean up samplers' use of compute devices, thanks @Disty0
+- **other**
+  - extension `sd-webui-controlnet` is locked to commit `ecd33eb` due to breaking changes
+  - extension `stable-diffusion-webui-images-browser` is locked to commit `27fe4a7` due to breaking changes
+  - updated core requirements
+  - fully dynamic pipelines
+    pipeline switch is now done on-the-fly and does not require manual initialization of individual components
+    this allows for quick implementation of new pipelines
+    see `modules/sd_models.py:switch_pipe` for details
+  - major internal ui module refactoring
+    this may cause compatibility issues if an extension is doing a direct import from `ui.py`
+    in which case, report it so we can add a compatibility layer
+  - major public api refactoring
+    this may cause compatibility issues if an extension is doing a direct import from `api.py` or `models.py`
+    in which case, report it so we can add a compatibility layer
+
+## Update for 2023-12-29
+
+To wrap up this amazing year, we're releasing a new version of [SD.Next](https://github.com/vladmandic/automatic), and this one is absolutely massive!
+
+### Highlights 2023-12-29
+
+- Brand new Control module for *text, image, batch and video* processing
+  Native implementation of all control methods for both *SD15* and *SD-XL*
+  ▹ **ControlNet | ControlNet XS | Control LLLite | T2I Adapters | IP Adapters**
+  For details, see the [Wiki](https://github.com/vladmandic/automatic/wiki/Control) documentation
+- Support for new model types out-of-the-box
+  This brings the number of supported t2i/i2i model families to 13!
+  ▹ **Stable Diffusion 1.5/2.1 | SD-XL | LCM | Segmind | Kandinsky | Pixart-α | Würstchen | aMUSEd | DeepFloyd IF | UniDiffusion | SD-Distilled | BLiP Diffusion | etc.**
+- New video capabilities:
+  ▹ **AnimateDiff | SVD | ModelScope | ZeroScope**
+- Enhanced platform support
+  ▹ **Windows | Linux | MacOS** with **nVidia | AMD | IntelArc | DirectML | OpenVINO | ONNX+Olive** backends
+- Better onboarding experience (first install)
+  with all model types available for single-click download & load (networks -> reference)
+- Performance optimizations!
+  For a comparison of different processing options and compile backends, see the [Wiki](https://github.com/vladmandic/automatic/wiki/Benchmark)
+  As a highlight, we're reaching **~100 it/s** (no tricks, this is with full features enabled and end-to-end on a standard nVidia RTX4090)
+- New [custom pipelines](https://github.com/vladmandic/automatic/blob/dev/scripts/example.py) framework for quickly porting any new pipeline
+
+And other improvements in areas such as: Upscaling (up to 8x now with 40+ available upscalers), Inpainting (better quality), Prompt scheduling, new Sampler options, new LoRA types, additional UI themes, better HDR processing, built-in Video interpolation, parallel Batch processing, etc.
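The built-in video interpolation mentioned above is motion-aware (SD.Next uses RIFE, detailed later in this changelog). As a toy illustration of the frame-count arithmetic only — 16 source frames at 10x yields 160 output frames — here is a naive linear blend over scalar stand-in "frames"; this is purely illustrative and is not the RIFE algorithm:

```python
def interpolate_frames(frames, factor):
    """Insert linearly blended in-between frames (toy stand-in for RIFE).

    Each source frame contributes `factor` output frames blended toward
    the next source frame; the last frame is simply repeated.
    Output length is always len(frames) * factor.
    """
    out = []
    for i, frame in enumerate(frames):
        nxt = frames[i + 1] if i + 1 < len(frames) else frame
        for k in range(factor):
            t = k / factor
            out.append(frame * (1 - t) + nxt * t)  # linear blend, not motion-aware
    return out
```

For real frames these would be image tensors rather than scalars, and RIFE estimates optical flow instead of blending linearly, but the 16 → 160 frame bookkeeping is the same.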
+
+Plus some nifty new modules such as **FaceID** (automatic face guidance using embeds during generation) and **Depth 3D** (image to 3D scene)
+
+### Full ChangeLog 2023-12-29
+
+- **Control**
+  - native implementation of all image control methods:
+    **ControlNet**, **ControlNet XS**, **Control LLLite**, **T2I Adapters** and **IP Adapters**
+  - top-level **Control** tab next to **Text** and **Image** generate
+  - supports all variations of **SD15** and **SD-XL** models
+  - supports *Text*, *Image*, *Batch* and *Video* processing
+  - for details and the list of supported models and workflows, see the Wiki documentation
+
+- **Diffusers**
+  - [Segmind Vega](https://huggingface.co/segmind/Segmind-Vega) model support
+    - small and fast version of **SDXL**, only 3.1GB in size!
+    - select from *networks -> reference*
+  - [aMUSEd 256](https://huggingface.co/amused/amused-256) and [aMUSEd 512](https://huggingface.co/amused/amused-512) model support
+    - lightweight models that excel at fast image generation
+    - *note*: must select: settings -> diffusers -> generator device: unset
+    - select from *networks -> reference*
+  - [Playground v1](https://huggingface.co/playgroundai/playground-v1), [Playground v2 256](https://huggingface.co/playgroundai/playground-v2-256px-base), [Playground v2 512](https://huggingface.co/playgroundai/playground-v2-512px-base), [Playground v2 1024](https://huggingface.co/playgroundai/playground-v2-1024px-aesthetic) model support
+    - comparable to SD15 and SD-XL, trained from scratch for highly aesthetic images
+    - simply select from *networks -> reference* and use as usual
+  - [BLIP-Diffusion](https://dxli94.github.io/BLIP-Diffusion-website/)
+    - img2img model that can replace subjects in images using prompt keywords
+    - download and load by selecting from *networks -> reference -> blip diffusion*
+    - in image tab, select `blip diffusion` script
+  - [DemoFusion](https://github.com/PRIS-CV/DemoFusion) run your SDXL generations at any resolution!
+    - in **Text** tab select *script* -> *demofusion*
+    - *note*: GPU VRAM limits do not automatically go away, so be careful when using it with large resolutions
+      in the future, expect more optimizations, especially related to offloading/slicing/tiling,
+      but at the moment this is pretty much experimental-only
+  - [AnimateDiff](https://github.com/guoyww/animatediff/)
+    - overall improved quality
+    - can now be used with *second pass* - enhance, upscale and hires your videos!
+  - [IP Adapter](https://github.com/tencent-ailab/IP-Adapter)
+    - add support for **ip-adapter-plus_sd15, ip-adapter-plus-face_sd15 and ip-adapter-full-face_sd15**
+    - can now be used in *xyz-grid*
+  - **Text-to-Video**
+    - in text tab, select `text-to-video` script
+    - supported models: **ModelScope v1.7b, ZeroScope v1, ZeroScope v1.1, ZeroScope v2, ZeroScope v2 Dark, Potat v1**
+      *if you know of any other t2v models you'd like to see supported, let me know!*
+    - models are auto-downloaded on first use
+    - *note*: current base model will be unloaded to free up resources
+  - **Prompt scheduling** now implemented for Diffusers backend, thanks @AI-Casanova
+  - **Custom pipelines** contribute by adding your own custom pipelines!
+    - for details, see fully documented example:
+
+  - **Schedulers**
+    - add timesteps range; changing it will make the scheduler over-complete or under-complete
+    - add rescale betas with zero SNR option (applicable to Euler, Euler a and DDIM, allows for higher dynamic range)
+  - **Inpaint**
+    - improved quality when using mask blur and padding
+  - **UI**
+    - 3 new native UI themes: **orchid-dreams**, **emerald-paradise** and **timeless-beige**, thanks @illu_Zn
+    - more dynamic controls depending on the backend (original or diffusers)
+      controls that are not applicable in current mode are now hidden
+    - allow setting of resize method directly in image tab
+      (previously via settings -> upscaler_for_img2img)
+- **Optional**
+  - **FaceID** face guidance during generation
+    - also based on IP adapters, but with additional face detection and external embeddings calculation
+    - calculates face embeds based on input image and uses them to guide generation
+    - simply select from *scripts -> faceid*
+    - *experimental module*: requirements must be installed manually:
+      > pip install insightface ip_adapter
+  - **Depth 3D** image to 3D scene
+    - delivered as an extension, install from extensions tab
+    - creates a fully compatible 3D scene from any image by using depth estimation and creating a fully populated mesh
+    - scene can be freely viewed in 3D in the UI itself or downloaded for use in other applications
+  - [ONNX/Olive](https://github.com/vladmandic/automatic/wiki/ONNX-Olive)
+    - major work continues in olive branch, see wiki for details, thanks @lshqqytiger
+      as a highlight, 4-5 it/s using DirectML on AMD GPU translates to 23-25 it/s using ONNX/Olive!
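The Depth 3D idea above (depth estimation → fully populated mesh) can be sketched in a few lines. This is a hypothetical, minimal pixel-grid triangulation for illustration, not the extension's actual implementation:

```python
def depth_to_mesh(depth):
    """Convert a 2-D depth map into a triangle mesh (vertices + faces).

    Illustrative sketch: each pixel becomes a vertex (x, y, z=depth),
    and each 2x2 pixel quad becomes two triangles, yielding a fully
    populated mesh over the image plane.
    """
    h, w = len(depth), len(depth[0])
    vertices = [(x, y, depth[y][x]) for y in range(h) for x in range(w)]
    faces = []
    for y in range(h - 1):
        for x in range(w - 1):
            i = y * w + x                            # top-left vertex of the quad
            faces.append((i, i + 1, i + w))          # upper-left triangle
            faces.append((i + 1, i + w + 1, i + w))  # lower-right triangle
    return vertices, faces
```

An `h*w` depth map produces `h*w` vertices and `2*(h-1)*(w-1)` triangles; a real pipeline would additionally texture the mesh with the source image and scale the depth axis.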
+- **General**
+  - new **onboarding**
+    - if no models are found during startup, app will no longer ask to download a default checkpoint
+      instead, it will show a message in UI with options to change model path or download any of the reference checkpoints
+    - *extra networks -> models -> reference* section is now enabled for both original and diffusers backend
+  - support for **Torch 2.1.2** (release) and **Torch 2.3** (dev)
+  - **Process** create videos from batch or folder processing
+    supports *GIF*, *PNG* and *MP4* with full interpolation, scene change detection, etc.
+  - **LoRA**
+    - add support for block weights, thanks @AI-Casanova
+      example ``
+    - add support for LyCORIS GLora networks
+    - add support for LoRA PEFT (*Diffusers*) networks
+    - add support for Lora-OFT (*Kohya*) and Lyco-OFT (*Kohaku*) networks
+    - reintroduce alternative loading method in settings: `lora_force_diffusers`
+    - add support for `lora_fuse_diffusers` if using alternative method
+      use if you have multiple complex loras that may be causing performance degradation
+      as it fuses lora with model during load instead of interpreting lora on-the-fly
+  - **CivitAI downloader** allow usage of access tokens for download of gated or private models
+  - **Extra networks** new *setting -> extra networks -> build info on first access*
+    indexes all networks on first access instead of server startup
+  - **IPEX**, thanks @disty0
+    - update to **Torch 2.1**
+      if you get file not found errors, set `DISABLE_IPEXRUN=1` and run the webui with `--reinstall`
+    - built-in *MKL* and *DPCPP* for IPEX, no need to install OneAPI anymore
+    - **StableVideoDiffusion** is now supported with IPEX
+    - **8 bit support with NNCF** on Diffusers backend
+    - fix IPEX Optimize not applying with Diffusers backend
+    - disable 32bit workarounds if the GPU supports 64bit
+    - add `DISABLE_IPEXRUN` and `DISABLE_IPEX_1024_WA` environment variables
+    - performance and compatibility improvements
+  - **OpenVINO**, thanks @disty0
+    - **8 bit support for CPUs**
+    - reduce system RAM usage
+    - update to Torch 2.1.2
+    - add *Directory for OpenVINO cache* option to *System Paths*
+    - remove Intel ARC specific 1024x1024 workaround
+  - **HDR controls**
+    - batch-aware for enhancement of multiple images or video frames
+    - available in image tab
+  - **Logging**
+    - additional *TRACE* logging enabled via specific env variables
+      see for details
+    - improved profiling
+      use with `--debug --profile`
+    - log output file sizes
+  - **Other**
+    - **API** several minor but breaking changes to API behavior to better align response fields, thanks @Trojaner
+    - **Inpaint** add option `apply_overlay` to control if inpaint result should be applied as overlay or as-is
+      can remove artifacts and hard edges of inpaint area but also remove some details from original
+    - **chaiNNer** fix `NaN` issues due to autocast
+    - **Upscale** increase limit from 4x to 8x given the quality of some upscalers
+    - **Extra Networks** fix sort
+    - reduced default **CFG scale** from 6 to 4 to be more out-of-the-box compatible with LCM/Turbo models
+    - disable google fonts check on server startup
+    - fix torchvision/basicsr compatibility
+    - fix styles quick save
+    - add hdr settings to metadata
+    - improve handling of long filenames and filenames during batch processing
+    - do not set preview samples when using via api
+    - avoid unnecessary resizes in img2img and inpaint
+    - safe handling of config updates to avoid file corruption on I/O errors
+    - updated `cli/simple-txt2img.py` and `cli/simple-img2img.py` scripts
+    - save `params.txt` regardless of image save status
+    - update built-in log monitor in ui, thanks @midcoastal
+    - major CHANGELOG doc cleanup, thanks @JetVarimax
+    - major INSTALL doc cleanup, thanks @JetVarimax
+
+## Update for 2023-12-04
+
+What's new?
+Native video in SD.Next via both **AnimateDiff** and **Stable-Video-Diffusion** - including native MP4 encoding and smooth video output out-of-the-box, not just animated GIFs.
+Also new is support for **SDXL-Turbo**, new **Kandinsky 3** models, cool latent correction via **HDR controls** for any *txt2img* workflow, best-of-class **SDXL model merge** using full ReBasin methods, and further mobile UI optimizations.
+
+- **Diffusers**
+  - **IP adapter**
+    - lightweight native implementation of IP adapters which can guide generation towards a specific image style
+    - supports most T2I models, not limited to SD 1.5
+    - models are auto-downloaded on first use
+    - for IP adapter support in *Original* backend, use the standard *ControlNet* extension
+  - **AnimateDiff**
+    - lightweight native implementation of AnimateDiff models:
+      *AnimateDiff 1.4, 1.5 v1, 1.5 v2, AnimateFace*
+    - supports SD 1.5 only
+    - models are auto-downloaded on first use
+    - for video saving support, see video support section
+    - can be combined with IP-Adapter for even better results!
+    - for AnimateDiff support in *Original* backend, use the standard *AnimateDiff* extension
+  - **HDR latent control**, based on [article](https://huggingface.co/blog/TimothyAlexisVass/explaining-the-sdxl-latent-space#long-prompts-at-high-guidance-scales-becoming-possible)
+    - in *Advanced* params
+    - allows control of *latent clamping*, *color centering* and *range maximization*
+    - supported by *XYZ grid*
+  - [SD21 Turbo](https://huggingface.co/stabilityai/sd-turbo) and [SDXL Turbo]() support
+    - just set CFG scale (0.0-1.0) and steps (1-3) to a very low value
+    - compatible with original StabilityAI SDXL-Turbo or any of the newer merges
+    - download safetensors or select from networks -> reference
+  - [Stable Video Diffusion](https://huggingface.co/stabilityai/stable-video-diffusion-img2vid) and [Stable Video Diffusion XT](https://huggingface.co/stabilityai/stable-video-diffusion-img2vid-xt) support
+    - download using built-in model downloader or simply select from *networks -> reference*
+      support for manually downloaded safetensors models will be added later
+    - for video saving support, see video support section
+    - go to *image* tab, enter input image and select *script* -> *stable video diffusion*
+  - [Kandinsky 3](https://huggingface.co/kandinsky-community/kandinsky-3) support
+    - download using built-in model downloader or simply select from *networks -> reference*
+    - this model is absolutely massive at 27.5GB at fp16, so be patient
+    - model params count is 11.9B (compared to SD-XL at 3.3B) and it's trained on mixed resolutions from 256px to 1024px
+    - use either model offload or sequential cpu offload to be able to use it
+  - better autodetection of *inpaint* and *instruct* pipelines
+  - support long secondary prompt for refiner
+- **Video support**
+  - applies to any model that supports video generation, e.g.
+    AnimateDiff and StableVideoDiffusion
+  - support for **animated-GIF**, **animated-PNG** and **MP4**
+    - GIF and PNG can be looped
+    - MP4 can have additional padding at the start/end as well as motion-aware interpolated frames for smooth playback
+      interpolation is done using [RIFE](https://arxiv.org/abs/2011.06294) with a native implementation in SD.Next
+      and it's fast - interpolating 16 frames at 10x to a target of 160 frames takes 2-3 seconds
+  - output folder for videos is in *settings -> image paths -> video*
+- **General**
+  - redesigned built-in profiler
+    - now includes both `python` and `torch` and traces individual functions
+    - use with `--debug --profile`
+  - **model merge** add **SD-XL ReBasin** support, thanks @AI-Casanova
+  - further UI optimizations for **mobile devices**, thanks @iDeNoh
+  - log level defaults to info for console and debug for log file
+  - better prompt display in process tab
+  - increase maximum lora cache values
+  - fix extra networks sorting
+  - fix controlnet compatibility issues in original backend
+  - fix img2img/inpaint paste params
+  - fix save text file for manually saved images
+  - fix python 3.9 compatibility issues
+
+## Update for 2023-11-23
+
+New release, primarily focused around three major new features: full **LCM** support, completely new **Model Merge** functionality and **Stable-fast** compile support
+Also included are several other improvements and a large number of hotfixes - see full changelog for details
+
+- **Diffusers**
+  - **LCM** support for any *SD 1.5* or *SD-XL* model!
+    - download [lcm-lora-sd15](https://huggingface.co/latent-consistency/lcm-lora-sdv1-5/tree/main) and/or [lcm-lora-sdxl](https://huggingface.co/latent-consistency/lcm-lora-sdxl/tree/main)
+    - load your favorite *SD 1.5* or *SD-XL* model *(original LCM was SD 1.5 only, this is both)*
+    - load **lcm lora** *(note: lcm lora is processed differently than any other lora)*
+    - set **sampler** to **LCM**
+    - set number of steps to some low number, for SD-XL 6-7 steps is normally sufficient
+      note: LCM scheduler does not support steps higher than 50
+    - set CFG to between 1 and 2
+  - Add `cli/lcm-convert.py` script to convert any SD 1.5 or SD-XL model to an LCM model
+    by baking in the LORA and uploading to Huggingface, thanks @Disty0
+  - Support for [Stable Fast](https://github.com/chengzeyi/stable-fast) model compile on *Windows/Linux/WSL2* with *CUDA*
+    See [Wiki:Benchmark](https://github.com/vladmandic/automatic/wiki/Benchmark) for details and comparison
+    of different backends, precision modes, advanced settings and compile modes
+    *Hint*: **70+ it/s** is possible on *RTX4090* with no special tweaks
+  - Add additional pipeline types for manual model loads when loading from `safetensors`
+  - Updated logic for calculating **steps** when using base/hires/refiner workflows
+  - Improve **model offloading** for both model and sequential cpu offload when dealing with meta tensors
+  - Safe model offloading for non-standard models
+  - Fix **DPM SDE** scheduler
+  - Better support for SD 1.5 **inpainting** models
+  - Add support for **OpenAI Consistency decoder VAE**
+  - Enhance prompt parsing with long prompts and support for *BREAK* keyword
+    Change-in-behavior: new line in prompt now means *BREAK*
+  - Add alternative Lora loading algorithm, triggered if `SD_LORA_DIFFUSERS` is set
+- **Models**
+  - **Model merge**
+    - completely redesigned, now based on best-of-class `meh` by @s1dlx
+      and heavily modified for additional functionality and fully integrated by @AI-Casanova (thanks!)
+    - merge SD or SD-XL models using *simple merge* (12 methods),
+      using one of the *presets* (20 built-in presets) or custom block merge values
+    - merge with ReBasin permutations and/or clipping protection
+    - fully multithreaded for fastest merge possible
+  - **Model update**
+    - under UI -> Models -> Update
+    - scan existing models for updated metadata on CivitAI and
+      provide download functionality for models with available updates
+- **Extra networks**
+  - Use multi-threading for 5x load speedup
+  - Better Lora trigger words support
+  - Auto refresh styles on change
+- **General**
+  - Many **mobile UI** optimizations, thanks @iDeNoh
+  - Support for **Torch 2.1.1** with CUDA 12.1 or CUDA 11.8
+  - Configurable location for HF cache folder
+    Default is standard `~/.cache/huggingface/hub`
+  - Reworked parser when pasting previously generated images/prompts
+    includes all `txt2img`, `img2img` and `override` params
+  - Reworked **model compile**
+  - Support custom upscalers in subfolders
+  - Add additional image info when loading image in process tab
+  - Better file locking when sharing config and/or models between multiple instances
+  - Handle custom API endpoints when using auth
+  - Show logged-in user in log when accessing via UI and/or API
+  - Support `--ckpt none` to skip loading a model
+- **XYZ grid**
+  - Add refiner options to XYZ Grid
+  - Add option to create only subgrids in XYZ grid, thanks @midcoastal
+  - Allow custom font, background and text color in settings
+- **Fixes**
+  - Fix `params.txt` saved before actual image
+  - Fix inpaint
+  - Fix manual grid image save
+  - Fix img2img init image save
+  - Fix upscale in txt2img for batch counts when no hires is used
+  - More uniform model paths
+  - Safe scripts callback execution
+  - Improved extension compatibility
+  - Improved BF16 support
+  - Match previews for reference models with downloaded models
+
+## Update for 2023-11-06
+
+Another pretty big release, this time with focus on new models (3 new model types), new
+backends and optimizations
+Plus quite a few fixes
+
+Also, the [Wiki](https://github.com/vladmandic/automatic/wiki) has been updated with new content, so check it out!
+Some highlights: [OpenVINO](https://github.com/vladmandic/automatic/wiki/OpenVINO), [IntelArc](https://github.com/vladmandic/automatic/wiki/Intel-ARC), [DirectML](https://github.com/vladmandic/automatic/wiki/DirectML), [ONNX/Olive](https://github.com/vladmandic/automatic/wiki/ONNX-Olive)
+
+- **Diffusers**
+  - since **SD.Next** now supports **12** different model types, we've added a reference model for each type in
+    *Extra networks -> Reference* for easier selection & auto-download
+    Models can still be downloaded manually, this is just a convenience feature & a showcase for supported models
+  - new model type: [Segmind SSD-1B](https://huggingface.co/segmind/SSD-1B)
+    it's a *distilled* model trained at 1024px, this time a 50% smaller and faster version of SD-XL!
+    (and quality does not suffer, it's just more optimized)
+    test shows batch-size:4 with 1k images at full quality used less than 6.5GB of VRAM
+    and for further optimization, you can use the built-in **TAESD** decoder,
+    which results in batch-size:16 with 1k images using 7.9GB of VRAM
+    select from extra networks -> reference or download using built-in **Huggingface** downloader: `segmind/SSD-1B`
+  - new model type: [Pixart-α XL 2](https://github.com/PixArt-alpha/PixArt-alpha)
+    in medium/512px and large/1024px variations
+    comparable in quality to SD 1.5 and SD-XL, but with a better text encoder and highly optimized training pipeline
+    so finetunes can be done in as little as 10% of the time compared to SD/SD-XL (note that due to the much larger text encoder, it is a large model)
+    select from extra networks -> reference or download using built-in **Huggingface** downloader: `PixArt-alpha/PixArt-XL-2-1024-MS`
+  - new model type: [LCM: Latent Consistency Models](https://github.com/openai/consistency_models)
+    trained at 512px, but with near-instant generation in as
+    little as 3 steps!
+    combined with OpenVINO, generation on CPU takes less than 5-10 seconds
+    and an absolute beast when combined with **HyperTile** and **TAESD** decoder, resulting in **28 FPS**
+    (on RTX4090 for batch 16x16 at 512px)
+    note: set sampler to **Default** before loading model as LCM comes with its own *LCMScheduler* sampler
+    select from extra networks -> reference or download using built-in **Huggingface** downloader: `SimianLuo/LCM_Dreamshaper_v7`
+  - support for **Custom pipelines**, thanks @disty0
+    download using built-in **Huggingface** downloader
+    think of them as plugins for diffusers, not unlike original extensions that modify behavior of the `ldm` backend
+    list of community pipelines:
+  - new custom pipeline: `Disty0/zero123plus-pipeline`, thanks @disty0
+    generate 4 output images with different camera positions: front, side, top, back!
+    for more details, see
+  - new backend: **ONNX/Olive** *(experimental)*, thanks @lshqqytiger
+    for details, see [WiKi](https://github.com/vladmandic/automatic/wiki/ONNX-Runtime)
+  - extend support for [Free-U](https://github.com/ChenyangSi/FreeU)
+    improves generation quality at no cost (other than finding params that work for you)
+- **General**
+  - attempt to auto-fix invalid samples which occur due to math errors in lower precision
+    example: `RuntimeWarning: invalid value encountered in cast: sample = sample.astype(np.uint8)`
+    begone **black images** *(note: if it proves as working, this solution will need to be expanded to cover all scenarios)*
+  - add **Lora OFT** support, thanks @antis0007 and @ai-casanova
+  - **Upscalers**
+    - **compile** option, thanks @disty0
+    - **chaiNNer** add high quality models from [Helaman](https://openmodeldb.info/users/helaman)
+  - redesigned **Progress bar** with full details on current operation
+  - new option: *settings -> images -> keep incomplete*
+    can be used to skip vae decode on aborted/skipped/interrupted image generations
+  - new option: *settings -> system paths ->
+    models*
+    can be used to set a custom base path for *all* models (previously only as a cli option)
+  - remove external clone of items in `/repositories`
+  - **Interrogator** module has been removed from `extensions-builtin`
+    and fully implemented (and improved) natively
+- **UI**
+  - UI tweaks for default themes
+  - UI switch core font in default theme to **noto-sans**
+    previously the default font was simply *system-ui*, but it led to too many variations between browsers and platforms
+  - UI tweaks for mobile devices, thanks @iDeNoh
+  - updated **Context menu**
+    right-click on any button in action menu (e.g. generate button)
+- **Extra networks**
+  - sort by name, size, date, etc.
+  - switch between *gallery* and *list* views
+  - add tags from user metadata (in addition to tags in model metadata) for **lora**
+  - added **Reference** models for diffusers backend
+  - faster enumeration of all networks on server startup
+- **Packages**
+  - updated `diffusers` to 0.22.0, `transformers` to 4.34.1
+  - update **openvino**, thanks @disty0
+  - update **directml**, thanks @lshqqytiger
+- **Compute**
+  - **OpenVINO**:
+    - updated to mainstream `torch` *2.1.0*
+    - support for **ESRGAN** upscalers
+- **Fixes**
+  - fix **freeu** for backend original and add it to xyz grid
+  - fix loading diffuser models in huggingface format from non-standard location
+  - fix default styles looking in wrong location
+  - fix missing upscaler folder on initial startup
+  - fix handling of relative path for models
+  - fix simple live preview device mismatch
+  - fix batch img2img
+  - fix diffusers samplers: dpm++ 2m, dpm++ 1s, deis
+  - fix new style filename template
+  - fix image name template using model name
+  - fix image name sequence
+  - fix model path using relative path
+  - fix safari/webkit layout, thanks @eadnams22
+  - fix `torch-rocm` and `tensorflow-rocm` version detection, thanks @xangelix
+  - fix **chainner** upscalers color clipping
+  - fix for base+refiner workflow in diffusers mode: number of
+    steps, diffuser pipe mode
+  - fix for prompt encoder with refiner in diffusers mode
+  - fix prompts-from-file saving incorrect metadata
+  - fix add/remove extra networks to prompt
+  - fix before-hires step
+  - fix diffusers switch from invalid model
+  - force second requirements check on startup
+  - remove **lyco**, multiple_tqdm
+  - enhance extension compatibility for extensions directly importing codeformers
+  - enhance extension compatibility for extensions directly accessing processing params
+  - **css** fixes
+  - clearly mark external themes in ui
+  - update `typing-extensions`
+
+## Update for 2023-10-17
+
+This is a major release, with many changes and new functionality...
+
+Changelog is massive, but do read through or you'll be missing out on some very cool new functionality
+or even free speedups and quality improvements (regardless of which workflows you're using)!
+
+Note that for this release it's recommended to perform a clean install (e.g. fresh `git clone`)
+Upgrades are still possible and supported, but a clean install is recommended for the best experience
+
+- **UI**
+  - added **change log** to UI
+    see *System -> Changelog*
+  - converted submenus from checkboxes to accordion elements
+    any ui state including state of open/closed menus can be saved as default!
+    see *System -> User interface -> Set menu states*
+  - new built-in theme **invoked**
+    thanks @BinaryQuantumSoul
+  - add **compact view** option in settings -> user interface
+  - small visual indicator bottom right of page showing internal server job state
+- **Extra networks**:
+  - **Details**
+    - new details interface to view and save data about extra networks
+      main ui now has a single button on each network card to trigger details view
+    - details view includes model/lora metadata parser!
+    - details view includes civitai model metadata!
+ - **Metadata**: + - you can scan [civitai](https://civitai.com/) + for missing metadata and previews directly from extra networks + simply click on button in top-right corner of extra networks page + - **Styles** + - save/apply icons moved to extra networks + - can be edited in details view + - support for single or multiple styles per json + - support for embedded previews + - large database of art styles included by default + can be disabled in *settings -> extra networks -> show built-in* + - styles can also be used in a prompt directly: `` + if style is an exact match, it will be used + otherwise it will rotate between styles that match the start of the name + that way you can use different styles as wildcards when processing batches + - styles can have **extra** fields, not just prompt and negative prompt + for example: *"Extra: sampler: Euler a, width: 480, height: 640, steps: 30, cfg scale: 10, clip skip: 2"* + - **VAE** + - VAEs are now also listed as part of extra networks + - Image preview methods have been redesigned: simple, approximate, taesd, full + please set desired preview method in settings + - both original and diffusers backend now support "full quality" setting + if your desired model or platform does not support FP16 and/or you have low-end hardware and cannot use FP32 + you can disable "full quality" in advanced params and it will likely reduce decode errors (infamous black images) + - **LoRA** + - LoRAs are now automatically filtered based on compatibility with currently loaded model + note that if lora type cannot be auto-determined, it will be left in the list + - **Refiner** + - you can load model from extra networks as base model or as refiner + simply select button in top-right of models page + - **General** + - faster search, ability to show/hide/sort networks + - refactored subfolder handling + *note*: this will trigger model hash recalculation on first model use +- **Diffusers**: + - better pipeline **auto-detect** when loading
from safetensors + - **SDXL Inpaint** + - although any model can be used for inpainting, there is a case to be made for + dedicated inpainting models as they are tuned to inpaint and not generate + - model can be used as base model for **img2img** or refiner model for **txt2img** + To download go to *Models -> Huggingface*: + - `diffusers/stable-diffusion-xl-1.0-inpainting-0.1` *(6.7GB)* + - **SDXL Instruct-Pix2Pix** + - model can be used as base model for **img2img** or refiner model for **txt2img** + this model is massive and requires a lot of resources! + to download go to *Models -> Huggingface*: + - `diffusers/sdxl-instructpix2pix-768` *(11.9GB)* + - **SD Latent Upscale** + - you can use *SD Latent Upscale* models as **refiner models** + this is a bit experimental, but it works quite well! + to download go to *Models -> Huggingface*: + - `stabilityai/sd-x2-latent-upscaler` *(2.2GB)* + - `stabilityai/stable-diffusion-x4-upscaler` *(1.7GB)* + - better **Prompt attention** + should better handle more complex prompts + for sdxl, choose which part of prompt goes to second text encoder - just add `TE2:` separator in the prompt + for hires and refiner, second pass prompt is used if present, otherwise primary prompt is used + new option in *settings -> diffusers -> sdxl pooled embeds* + thanks @AI-Casanova + - better **Hires** support for SD and SDXL + - better **TI embeddings** support for SD and SDXL + faster loading, wider compatibility and support for embeddings with multiple vectors + information about used embedding is now also added to image metadata + thanks @AI-Casanova + - better **Lora** handling + thanks @AI-Casanova + - better **SDXL preview** quality (approx method) + thanks @BlueAmulet + - new setting: *settings -> diffusers -> force inpaint* + as some models behave better when in *inpaint* mode even for normal *img2img* tasks +- **Upscalers**: + - pretty much a rewrite and tons of new upscalers - built-in list is now at **42** + - fix long outstanding
memory leak in legacy code, amazing this went undetected for so long + - more high quality upscalers available by default + **SwinIR** (2), **ESRGAN** (12), **RealESRGAN** (6), **SCUNet** (2) + - if that is not enough, there is new **chaiNNer** integration: + adds 15 more upscalers from different families out-of-the-box: + **HAT** (6), **RealHAT** (2), **DAT** (1), **RRDBNet** (1), **SPSRNet** (1), **SRFormer** (2), **SwiftSR** (2) + and yes, you can download and add your own, just place them in `models/chaiNNer` + - two additional latent upscalers based on SD upscale models when using Diffusers backend + **SD Upscale 2x**, **SD Upscale 4x** + note: Recommended usage for *SD Upscale* is by using second pass instead of upscaler + as it allows for tuning of prompt, seed, sampler settings which are used to guide upscaler + - upscalers are available in **xyz grid** + - simplified *settings->postprocessing->upscalers* + e.g. all upsamplers share same settings for tiling + - allow upscale-only as part of **txt2img** and **img2img** workflows + simply set *denoising strength* to 0 so hires does not get triggered + - unified init/download/execute/progress code + - easier installation +- **Samplers**: + - moved ui options to submenu + - default list for new installs is now all samplers, list can be modified in settings + - simplified samplers configuration in settings + plus added a few new ones like sigma min/max which can highly impact sampler behavior + - note that list of samplers is now *different* since keeping a flat-list of all possible + combinations results in 50+ samplers which is not practical + an item such as algorithm (e.g.
karras) is actually a sampler option, not a sampler itself +- **CivitAI**: + - civitai model download is now multithreaded and resumable + meaning that you can download multiple models in parallel + as well as resume aborted/incomplete downloads + - civitai integration in *models -> civitai* can now find + previews AND metadata for most models (checkpoints, loras, embeddings) + metadata is now parsed and saved in *[model].json* + typical hit rate is >95% for models, loras and embeddings + - description from parsed model metadata is used as model description if there is no manual + description file present in format of *[model].txt* + - to enable search for models, make sure all models have set hash values + *Models -> Validate -> Calculate hashes* +- **LoRA** + - new unified LoRA handler for all LoRA types (lora, lyco, loha, lokr, locon, ia3, etc.) + applies to both original and diffusers backend + thanks @AI-Casanova for diffusers port + - for *backend:original*, separate lyco handler has been removed +- **Compute** + - **CUDA**: + - default updated to `torch` *2.1.0* with cuda *12.1* + - testing moved to `torch` *2.2.0-dev/cu122* + - check out *generate context menu -> show nvml* for live gpu stats (memory, power, temp, clock, etc.)
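The *Calculate hashes* step mentioned under CivitAI above can be approximated with a streaming sha256 over the checkpoint file; the short-digest form below is an assumption for illustration — the exact scheme SD.Next uses for lookups may differ:

```python
import hashlib

def calculate_model_hash(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream a model file through sha256 and return a short hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # read in chunks so multi-GB checkpoints never load fully into memory
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    # truncated form is convenient for display and metadata lookups
    return digest.hexdigest()[:10]
```

Streaming matters here: checkpoints are routinely several gigabytes, so a single `read()` would be wasteful.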
+ - **Intel Arc/IPEX**: + - tons of optimizations, built-in binary wheels for Windows + i have to say, intel arc/ipex is getting to be quite a player, especially with openvino + thanks @Disty0 @Nuullll + - **AMD ROCm**: + - updated installer to detect `ROCm` *5.4/5.5/5.6/5.7* + - support for `torch-rocm-5.7` + - **xFormers**: + - default updated to *0.0.23* + - note that latest xformers are still not compatible with cuda 12.1 + recommended to use torch 2.1.0 with cuda 11.8 + if you attempt to use xformers with cuda 12.1, it will force a full xformers rebuild on install + which can take a very long time and may or may not work + - added cmd param `--use-xformers` to force usage of xformers + - **GC**: + - custom garbage collect threshold to reduce vram memory usage, thanks @Disty0 + see *settings -> compute -> gc* +- **Inference** + - new section in **settings** + - [HyperTile](https://github.com/tfernd/HyperTile): new! + available for *diffusers* and *original* backends + massive (up to 2x) speed-up of your generations for free :) + *note: hypertile is not compatible with any extension that modifies processing parameters such as resolution* + thanks @tfernd + - [Free-U](https://github.com/ChenyangSi/FreeU): new!
+ available for *diffusers* and *original* backends + improve generation quality at no cost (other than finding params that work for you) + *note: temporarily disabled for diffusers pending release of diffusers==0.22* + thanks @ljleb + - [Token Merging](https://github.com/dbolya/tomesd): not new, but updated + available for *diffusers* and *original* backends + speed up your generations by merging redundant tokens + speed up will depend on how aggressive you want to be with token merging + - **Batch mode** + new option *settings -> inference -> batch mode* + when using img2img process batch, optionally process multiple images in batch in parallel + thanks @Symbiomatrix +- **NSFW Detection/Censor** + - install extension: [NudeNet](https://github.com/vladmandic/sd-extension-nudenet) + body part detection, image metadata, advanced censoring, etc... + works for *text*, *image* and *process* workflows + more in the extension notes +- **Extensions** + - automatic discovery of new extensions on github + no more waiting for them to appear in index!
+ - new framework for extension validation + extensions ui now shows actual status of extensions for reviewed extensions + if you want to contribute/flag/update extension status, reach out on github or discord + - better overall compatibility with A1111 extensions (up to a point) + - [MultiDiffusion](https://github.com/pkuliyi2015/multidiffusion-upscaler-for-automatic1111) + has been removed from list of built-in extensions + you can still install it manually if desired + - [LyCORIS] + has been removed from list of built-in extensions + it is considered obsolete given that all functionality is now built-in +- **General** + - **Startup** + - all main CLI parameters can now be set as environment variables as well + for example `--data-dir ` can be specified as `SD_DATADIR=` before starting SD.Next + - **XYZ Grid** + - more flexibility to use selection or strings + - **Logging** + - get browser session info in server log + - allow custom log file destination + see `webui --log` + - when running with `--debug` flag, log is force-rotated + so each `sdnext.log.*` represents exactly one server run + - internal server job state tracking + - **Launcher** + - new `webui.ps1` powershell launcher for windows (old `webui.bat` is still valid) + thanks @em411 + - **API** + - add end-to-end example how to use API: `cli/simple-txt2img.js` + covers txt2img, upscale, hires, refiner + - **train.py** + - wrapper script around built-in **kohya's lora** training script + see `cli/train.py --help` + new support for sd and sdxl, thanks @evshiron + new support for full offline mode (without sdnext server running) +- **Themes** + - all built-in themes are fully supported: + - *black-teal (default), light-teal, black-orange, invoked, amethyst-nightfall, midnight-barbie* + - if you're using any **gradio default** themes or a **3rd party** theme that is not optimized for SD.Next, you may experience issues + default minimal style has been updated for compatibility, but actual styling is
completely outside of SD.Next control + +## Update for 2023-09-13 + +Started as mostly a service release with quite a few fixes, but then... +Major changes to how **hires** works as well as support for a very interesting new model [Wuerstchen](https://huggingface.co/blog/wuertschen) + +- tons of fixes +- changes to **hires** + - enable non-latent upscale modes (standard upscalers) + - when using latent upscale, hires pass is run automatically + - when using non-latent upscalers, hires pass is skipped by default + enabled using **force hires** option in ui + hires was not designed to work with standard upscalers, but i understand this is a common workflow + - when using refiner, upscale/hires runs before refiner pass + - second pass can now also utilize full/quick vae quality + - note that when combining non-latent upscale, hires and refiner output quality is maximum, + but operations are really resource intensive as it includes: *base->decode->upscale->encode->hires->refine* + - all combinations of: decode full/quick + upscale none/latent/non-latent + hires on/off + refiner on/off + should be supported, but given the number of combinations, issues are possible + - all operations are captured in image metadata +- diffusers: + - allow loading of sd/sdxl models from safetensors without online connectivity + - support for new model: [wuerstchen](https://huggingface.co/warp-ai/wuerstchen) + it's a high-resolution model (1024px+) that's ~40% faster than sd-xl with a bit lower resource requirements + go to *models -> huggingface -> search "warp-ai/wuerstchen" -> download* + it's nearly 12gb in size, so be patient :) +- minor re-layout of the main ui +- updated **ui hints** +- updated **models -> civitai** + - search and download loras + - find previews for already downloaded models or loras +- new option **inference mode** + - default is standard `torch.no_grad` + new option is `torch.inference_only` which is slightly faster and uses less vram, but only works on some gpus +-
new cmdline param `--no-metadata` + skips reading metadata from models that are not already cached +- updated **gradio** +- **styles** support for subfolders +- **css** optimizations +- clean-up **logging** + - capture system info in startup log + - better diagnostic output + - capture extension output + - capture ldm output + - cleaner server restart + - custom exception handling + +## Update for 2023-09-06 + +One week later, another large update! + +- system: + - full **python 3.11** support + note that changing python version does require reinstall + and if you're already on python 3.10, really no need to upgrade +- themes: + - new default theme: **black-teal** + - new light theme: **light-teal** + - new additional theme: **midnight-barbie**, thanks @nyxia +- extra networks: + - support for **tags** + show tags on hover, search by tag, list tags, add to prompt, etc. + - **styles** are now also listed as part of extra networks + existing `styles.csv` is converted upon startup to individual styles inside `models/style` + this is stage one of new styles functionality + old styles interface is still available, but will be removed in future + - cache file lists for much faster startup + speedups are 50+% for large number of extra networks + - ui refresh button now refreshes selected page, not all pages + - simplified handling of **descriptions** + now shows on-mouse-over without the need for user interaction + - **metadata** and **info** buttons only show if there is actual content +- diffusers: + - add full support for **textual inversions** (embeddings) + this applies to both sd15 and sdxl + thanks @ai-casanova for porting compel/sdxl code + - mix&match **base** and **refiner** models (*experimental*): + most of those are "because why not" and can result in corrupt images, but some are actually useful + also note that if you're not using actual refiner model, you need to bump refiner steps + as normal models are not designed to work with low step count + and if you're
having issues, try setting prompt parser to "fixed attention" as majority of problems + are due to token mismatches when using prompt attention + - any sd15 + any sd15 + - any sd15 + sdxl-refiner + - any sdxl-base + sdxl-refiner + - any sdxl-base + any sd15 + - any sdxl-base + any sdxl-base + - ability to **interrupt** (stop/skip) model generate + - added **aesthetics score** setting (for sdxl) + used to automatically guide unet towards more pleasing images + highly recommended for simple prompts + - added **force zeros** setting + create zero-tensor for prompt if prompt is empty (positive or negative) +- general: + - `rembg` remove backgrounds support for **is-net** model + - **settings** now show markers for all items set to non-default values + - **metadata** refactored how/what/when metadata is added to images + should result in much cleaner and more complete metadata + - pre-create all system folders on startup + - handle model load errors gracefully + - improved vram reporting in ui + - improved script profiling (when running in debug mode) + +## Update for 2023-08-30 + +Time for quite a large update that has been leaking bit-by-bit over the past week or so...
+*Note*: due to large changes, it is recommended to reset (delete) your `ui-config.json` + +- diffusers: + - support for **distilled** sd models + just go to models/huggingface and download a model, for example: + `segmind/tiny-sd`, `segmind/small-sd`, `segmind/portrait-finetuned` + those are lower quality, but extremely small and fast + up to 50% faster than sd 1.5 and execute in as little as 2.1gb of vram +- general: + - redesigned **settings** + - new layout with separated sections: + *settings, ui config, licenses, system info, benchmark, models* + - **system info** tab is now part of settings + when running outside of sdnext, system info is shown in main ui + - all system and image paths are now relative by default + - add settings validation when performing load/save + - settings tab in ui now shows settings that are changed from default values + - settings tab switch to compact view + - update **gradio** major version + this may result in some smaller layout changes since it's a major version change + however, browser page load is now much faster + - optimizations: + - optimize model hashing + - add cli param `--skip-all` that skips all installer checks + use at personal discretion, but it can be useful for bulk deployments + - add model **precompile** option (when model compile is enabled) + - **extra network** folder info caching + results in much faster startup when you have a large number of extra networks + - faster **xyz grid** switching + especially when using different checkpoints + - update **second pass** options for clarity + - models: + - civitai download missing model previews + - add **openvino** (experimental) cpu optimized model compile and inference + enable with `--use-openvino` + thanks @disty0 + - enable batch **img2img** scale-by workflows + now you can batch process with rescaling based on each individual original image size + - fixes: + - fix extra networks previews + - css fixes + - improved extensions compatibility (e.g.
*sd-cn-animation*) + - allow changing **vae** on-the-fly for both original and diffusers backend + +## Update for 2023-08-20 + +Another release that's been baking in dev branch for a while... + +- general: + - caching of extra network information to enable much faster create/refresh operations + thanks @midcoastal +- diffusers: + - add **hires** support (*experimental*) + applies to all model types that support img2img, including **sd** and **sd-xl** + also supports all hires upscaler types as well as standard params like steps and denoising strength + when used with **sd-xl**, it can be used with or without refiner loaded + how to enable - there are no explicit checkboxes other than second pass itself: + - hires: upscaler is set and target resolution is not at default + - refiner: if refiner model is loaded + - images save options: *before hires*, *before refiner* + - redo `move model to cpu` logic in settings -> diffusers to be more reliable + note that system defaults have also changed, so you may need to tweak to your liking + - update dependencies + +## Update for 2023-08-17 + +Smaller update, but with some breaking changes (to prepare for future larger functionality)...
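One of those changes is that the main `config.json` now stores only differences from defaults. Conceptually that is a shallow dict diff against the default settings — a sketch of the idea with hypothetical helper names, not the actual SD.Next code:

```python
def diff_from_defaults(settings: dict, defaults: dict) -> dict:
    """Keep only keys whose value differs from the shipped default."""
    return {k: v for k, v in settings.items() if defaults.get(k) != v}

def load_settings(saved: dict, defaults: dict) -> dict:
    """Reconstruct full settings by overlaying the saved diff on defaults."""
    return {**defaults, **saved}
```

The practical benefit is that upgrades can change a default without being overridden by a stale full copy in the user's config file.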
+ +- general: + - update all metadata saved with images + see for details + - improved **amd** installer with support for **navi 2x & 3x** and **rocm 5.4/5.5/5.6** + thanks @evshiron + - fix **img2img** resizing (applies to *original, diffusers, hires*) + - config change: main `config.json` no longer contains entire configuration + but only differences from defaults (similar to recent change performed to `ui-config.json`) +- diffusers: + - enable **batch img2img** workflows +- original: + - new samplers: **dpm++ 3M sde** (standard and karras variations) + enable in *settings -> samplers -> show samplers* + - expose always/never discard penultimate sigma + enable in *settings -> samplers* + +## Update for 2023-08-11 + +This is a big one that's been cooking in `dev` for a while now, but finally ready for release... + +- diffusers: + - **pipeline autodetect** + if pipeline is set to autodetect (default for new installs), app will try to autodetect pipeline based on selected model + this should reduce user errors such as loading **sd-xl** model when **sd** pipeline is selected + - **quick vae decode** as alternative to full vae decode which is very resource intensive + quick decode is based on `taesd` and produces lower quality, but it's great for tests or grids as it runs much faster and uses far less vram + disabled by default, selectable in *txt2img/img2img -> advanced -> full quality* + - **prompt attention** for sd and sd-xl + supports both `full parser` and native `compel` + thanks @ai-casanova + - advanced **lora load/apply** methods + in addition to standard lora loading that was recently added to sd-xl using diffusers, now we have + - **sequential apply** (load & apply multiple loras in sequential manner) and + - **merge and apply** (load multiple loras and merge before applying to model) + see *settings -> diffusers -> lora methods* + thanks @hameerabbasi and @ai-casanova + - **sd-xl vae** from safetensors now applies correct config + result is that 3rd party
vaes can be used without washed out colors + - options for optimized memory handling for lower memory usage + see *settings -> diffusers* +- general: + - new **civitai model search and download** + native support for civitai, integrated into ui as *models -> civitai* + - updated requirements + this time it's a bigger change so upgrade may take longer to install new requirements + - improved **extra networks** performance with a large number of networks + +## Update for 2023-08-05 + +Another minor update, but it unlocks some cool new items... + +- diffusers: + - vaesd live preview (sd and sd-xl) + - fix inpainting (sd and sd-xl) +- general: + - new torch 2.0 with ipex (intel arc) + - additional callbacks for extensions + enables latest comfyui extension + +## Update for 2023-07-30 + +Smaller release, but IMO worth a post... + +- diffusers: + - sd-xl loras are now supported! + - memory optimizations: Enhanced sequential CPU offloading, model CPU offload, FP16 VAE + - significant impact if running SD-XL (for example, but applies to any model) with only 8GB VRAM + - update packages +- minor bugfixes + +## Update for 2023-07-26 + +This is a big one, new models, new diffusers, new features and updated UI... + +First, **SD-XL 1.0** is released and yes, SD.Next supports it out of the box!
+ +- [SD-XL Base](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/blob/main/sd_xl_base_1.0.safetensors) +- [SD-XL Refiner](https://huggingface.co/stabilityai/stable-diffusion-xl-refiner-1.0/blob/main/sd_xl_refiner_1.0.safetensors) + +Also fresh is the new **Kandinsky 2.2** model that does look quite nice: + +- [Kandinsky Decoder](https://huggingface.co/kandinsky-community/kandinsky-2-2-decoder) +- [Kandinsky Prior](https://huggingface.co/kandinsky-community/kandinsky-2-2-prior) + +Actual changelog is: + +- general: + - new loading screens and artwork + - major ui simplification for both txt2img and img2img + nothing is removed, but you can show/hide individual sections + default is very simple interface, but you can enable any sections and save it as default in settings + - themes: add additional built-in theme, `amethyst-nightfall` + - extra networks: add add/remove tags to prompt (e.g. lora activation keywords) + - extensions: fix a couple of compatibility items + - firefox compatibility improvements + - minor image viewer improvements + - add backend and operation info to metadata + +- diffusers: + - we're out of the experimental phase and diffusers backend is considered stable + - sd-xl: support for **sd-xl 1.0** official model + - sd-xl: loading vae now applies to both base and refiner and saves a bit of vram + - sd-xl: denoising_start/denoising_end + - sd-xl: enable dual prompts + dual prompt is used if set regardless if refiner is enabled/loaded + if refiner is loaded & enabled, refiner prompt will also be used for refiner pass + - primary prompt goes to [OpenAI CLIP-ViT/L-14](https://huggingface.co/openai/clip-vit-large-patch14) + - refiner prompt goes to [OpenCLIP-ViT/bigG-14](https://huggingface.co/laion/CLIP-ViT-bigG-14-laion2B-39B-b160k) + - **kandinsky 2.2** support + note: kandinsky model must be downloaded using model downloader, not as safetensors due to specific model format + - refiner: fix batch processing + - vae: enable loading of
pure-safetensors vae files without config + also enable *automatic* selection to work with diffusers + - sd-xl: initial lora support + right now this applies to official lora released by **stability-ai**, support for **kohya's** lora is expected soon + - implement img2img and inpainting (experimental) + actual support and quality depends on model + it works as expected for sd 1.5, but not so much for sd-xl for now + - implement limited stop/interrupt for diffusers + works between stages, not within steps + - add option to save image before refiner pass + - option to set vae upcast in settings + - enable fp16 vae decode when using optimized vae + this pretty much doubles performance of decode step (delay after generate is done) + +- original + - fix hires secondary sampler + this now fully obsoletes `fallback_sampler` and `force_hr_sampler_name` + + +## Update for 2023-07-18 + +While we're waiting for official SD-XL release, here's another update with some fixes and enhancements... + +- **global** + - image save: option to add invisible image watermark to all your generated images + disabled by default, can be enabled in settings -> image options + watermark information will be shown when loading image such as in process image tab + also additional cli utility `/cli/image-watermark.py` to read/write/strip watermarks from images + - batch processing: fix metadata saving, also allow drag&drop of images for batch processing + - ui configuration: you can modify all ui default values from settings as usual, + but only values that are non-default will be written to `ui-config.json` + - startup: add cmd flag to skip all `torch` checks + - startup: force requirements check on each server start + there are too many misbehaving extensions that change system requirements + - internal: safe handling of all config file read/write operations + this allows sdnext to run in fully shared environments and prevents any possible configuration corruptions +- **diffusers**: + - sd-xl:
remove image watermarks autocreated by 0.9 model + - vae: enable loading of external vae, documented in diffusers wiki + and mix&match continues, you can even use sd-xl vae with sd 1.5 models! + - samplers: add concept of *default* sampler to avoid needing to tweak settings for primary or second pass + note that sampler details will be printed in log when running in debug level + - samplers: allow overriding of sampler beta values in settings + - refiner: fix refiner applying only to first image in batch + - refiner: allow using direct latents or processed output in refiner + - model: basic support for one more model: [UniDiffuser](https://github.com/thu-ml/unidiffuser) + download using model downloader: `thu-ml/unidiffuser-v1` + and set resolution to 512x512 + +## Update for 2023-07-14 + +Trying to unify settings for both original and diffusers backend without introducing duplicates... + +- renamed **hires fix** to **second pass** + as that is what it actually is, name hires fix is misleading to start with +- actual **hires fix** and **refiner** are now options inside **second pass** section +- obsoleted settings -> sampler -> **force_hr_sampler_name** + it is now part of **second pass** options and it works the same for both original and diffusers backend + which means you can use different scheduler settings for txt2img and hires if you want +- sd-xl refiner will run if it's loaded and if second pass is enabled + so you can quickly enable/disable refiner by simply enabling/disabling second pass +- you can mix&match **model** and **refiner** + for example, you can generate image using sd 1.5 and still use sd-xl refiner as second pass +- reorganized settings -> samplers to show which section refers to which backend +- added diffusers **lmsd** sampler + +## Update for 2023-07-13 + +Another big one, but now improvements to both **diffusers** and **original** backends as well plus ability to dynamically switch between them!
+ +- switch backend between diffusers and original on-the-fly + - you can still use `--backend ` and now that only means in which mode app will start, + but you can change it anytime in ui settings + - for example, you can even do things like generate image using sd-xl, + then switch to original backend and perform inpaint using a different model +- diffusers backend: + - separate ui settings for refiner pass with sd-xl + you can specify: prompt, negative prompt, steps, denoise start + - fix loading from pure safetensors files + now you can load sd-xl from safetensors file or from huggingface folder format + - fix kandinsky model (2.1 working, 2.2 was just released and will be supported soon) +- original backend: + - improvements to vae/unet handling as well as cross-optimization heads + in non-technical terms, this means lower memory usage and higher performance + and you should be able to generate higher resolution images without any other changes +- other: + - major refactoring of the javascript code + includes fixes for text selections and navigation + - system info tab now reports on nvidia driver version as well + - minor fixes in extra-networks + - installer handles origin changes for submodules + +big thanks to @huggingface team for great communication, support and fixing all the reported issues asap!
+ + +## Update for 2023-07-10 + +Service release with some fixes and enhancements: + +- diffusers: + - option to move base and/or refiner model to cpu to free up vram + - model downloader options to specify model variant / revision / mirror + - now you can download `fp16` variant directly for reduced memory footprint + - basic **img2img** workflow (*sketch* and *inpaint* are not supported yet) + note that **sd-xl** img2img workflows are architecturally different so it will take longer to implement + - updated hints for settings +- extra networks: + - fix corrupt display on refresh when new extra network type found + - additional ui tweaks + - generate thumbnails from previews only if preview resolution is above 1k +- image viewer: + - fixes for non-chromium browsers and mobile users and add option to download image + - option to download image directly from image viewer +- general + - fix startup issue with incorrect config + - installer should always check requirements on upgrades + +## Update for 2023-07-08 + +This is a massive update which has been baking in the `dev` branch for a while now + +- merge experimental diffusers support + +*TL;DR*: Yes, you can run **SD-XL** model in **SD.Next** now +For details, see Wiki page: [Diffusers](https://github.com/vladmandic/automatic/wiki/Diffusers) +Note this is still experimental, so please follow Wiki +Additional enhancements and fixes will be provided over the next few days +*Thanks to @huggingface team for making this possible and our internal @team for all the early testing* + +Release also contains a number of smaller updates: + +- add pan & zoom controls (touch and mouse) to image viewer (lightbox) +- cache extra networks between tabs + this should result in neat 2x speedup on building extra networks +- add settings -> extra networks -> do not automatically build extra network pages + speeds up app start if you have a lot of extra networks and you want to build them manually when needed +- extra network ui tweaks + +##
Update for 2023-07-01 + +Small quality-of-life updates and bugfixes: + +- add option to disallow usage of ckpt checkpoints +- change lora and lyco dir without server restart +- additional filename template fields: `uuid`, `seq`, `image_hash` +- image toolbar is now shown only when image is present +- image `Zip` button gone and its not optional setting that applies to standard `Save` button +- folder `Show` button is present only when working on localhost, + otherwise its replaced with `Copy` that places image URLs on clipboard so they can be used in other apps + +## Update for 2023-06-30 + +A bit bigger update this time, but contained to specific areas... + +- change in behavior + extensions no longer auto-update on startup + using `--upgrade` flag upgrades core app as well as all submodules and extensions +- **live server log monitoring** in ui + configurable via settings -> live preview +- new **extra networks interface** + *note: if youre using a 3rd party ui extension for extra networks, it will likely need to be updated to work with new interface* + - display in front of main ui, inline with main ui or as a sidebar + - lazy load thumbnails + drastically reduces load times for large number of extra networks + - auto-create thumbnails from preview images in extra networks in a background thread + significant load time saving on subsequent restarts + - support for info files in addition to description files + - support for variable aspect-ratio thumbnails + - new folder view +- **extensions sort** by trending +- add requirements check for training + +## Update for 2023-06-26 + +- new training tab interface + - redesigned preprocess, train embedding, train hypernetwork +- new models tab interface + - new model convert functionality, thanks @akegarasu + - new model verify functionality +- lot of ipex specific fixes/optimizations, thanks @disty0 + +## Update for 2023-06-20 + +This one is less relevant for standard users, but pretty major if youre running an actual 
server +But even if not, it still includes bunch of cumulative fixes since last release - and going by number of new issues, this is probably the most stable release so far... +(next one is not going to be as stable, but it will be fun :) ) + +- minor improvements to extra networks ui +- more hints/tooltips integrated into ui +- new dedicated api server + - but highly promising for high throughput server +- improve server logging and monitoring with + - server log file rotation + - ring buffer with api endpoint `/sdapi/v1/log` + - real-time status and load endpoint `/sdapi/v1/system-info/status` + +## Update for 2023-06-14 + +Second stage of a jumbo merge from upstream plus few minor changes... + +- simplify token merging +- reorganize some settings +- all updates from upstream: **A1111** v1.3.2 [df004be] *(latest release)* + pretty much nothing major that i havent released in previous versions, but its still a long list of tiny changes + - skipped/did-not-port: + add separate hires prompt: unnecessarily complicated and spread over large number of commits due to many regressions + allow external scripts to add cross-optimization methods: dangerous and i dont see a use case for it so far + load extension info in threads: unnecessary as other optimizations ive already put place perform equally good + - broken/reverted: + sub-quadratic optimization changes + +## Update for 2023-06-13 + +Just a day later and one *bigger update*... +Both some **new functionality** as well as **massive merges** from upstream + +- new cache for models/lora/lyco metadata: `metadata.json` + drastically reduces disk access on app startup +- allow saving/resetting of **ui default values** + settings -> ui defaults +- ability to run server without loaded model + default is to auto-load model on startup, can be changed in settings -> stable diffusion + if disabled, model will be loaded on first request, e.g. 
when you click generate + useful when you want to start server to perform other tasks like upscaling which do not rely on model +- updated `accelerate` and `xformers` +- huge nubmer of changes ported from **A1111** upstream + this was a massive merge, hopefully this does not cause any regressions + and still a bit more pending... + +## Update for 2023-06-12 + +- updated ui labels and hints to improve clarity and provide some extra info + this is 1st stage of the process, more to come... + if you want to join the effort, see +- new localization and hints engine + how hints are displayed can be selected in settings -> ui +- reworked **installer** sequence + as some extensions are loading packages directly from their preload sequence + which was preventing some optimizations to take effect +- updated **settings** tab functionality, thanks @gegell + with real-time monitor for all new and/or updated settings +- **launcher** will now warn if application owned files are modified + you are free to add any user files, but do not modify app files unless youre sure in what youre doing +- add more profiling for scripts/extensions so you can see what takes time + this applies both to initial load as well as execution +- experimental `sd_model_dict` setting which allows you to load model dictionary + from one model and apply weights from another model specified in `sd_model_checkpoint` + results? who am i to judge :) + + +## Update for 2023-06-05 + +Few new features and extra handling for broken extensions +that caused my phone to go crazy with notifications over the weekend... 
+
+- added extra networks to **xyz grid** options
+  now you can have more fun with all your embeddings and loras :)
+- new **vae decode** method to help with larger batch sizes, thanks @bigdog
+- new setting -> lora -> **use lycoris to handle all lora types**
+  this is still experimental, but the goal is to obsolete the old built-in lora module
+  as it doesn't understand many new loras and the built-in lyco module can handle them all
+- somewhat optimize browser page loading
+  still slower than I'd want, but gradio is pretty bad at this
+- profiling of scripts/extensions callbacks
+  you can now see how much pre/post-processing is done, not just how long generate takes
+- additional exception handling so a bad exception does not crash the main app
+- additional background removal models
+- some work on bfloat16 which nobody really should be using, but why not 🙂
+
+
+## Update for 2023-06-02
+
+Some quality-of-life improvements while working on larger stuff in the background...
+
+- redesign action box to be uniform across all themes
+- add **pause** option next to stop/skip
+- redesigned progress bar
+- add new built-in extension: **agent-scheduler**
+  a very elegant way of getting full queueing capabilities, thanks @artventurdev
+- enable more image formats
+  note: not all are understood by browsers, so previews and images may appear blank
+  unless you have some browser extensions that can handle them
+  but they are saved correctly. And you can't beat the raw quality of 32-bit `tiff` or `psd` :)
+- change in behavior: `xformers` will be uninstalled on startup if not active
+  if you do have `xformers` selected as your desired cross-optimization method, then they will be used
+  the reason is that a lot of libraries try to blindly import xformers even if they are not selected or not functional
+
+## Update for 2023-05-30
+
+Another bigger one... And more to come in the next few days...
+
+- new live preview mode: taesd
+  I really like this one, so it's enabled by default for new installs
+- settings search feature
+- new sampler: dpm++ 2m sde
+- fully common save/zip/delete (new) options in all tabs
+  which (again) meant a rework of the process image tab
+- system info tab: live gpu utilization/memory graphs for nvidia gpus
+- updated controlnet interface
+- minor style changes
+- updated lora, swinir, scunet and ldsr code from upstream
+- start of merge from a1111 v1.3
+
+## Update for 2023-05-26
+
+Some quality-of-life improvements...
+
+- updated [README](https://github.com/vladmandic/automatic/blob/master/README.md)
+- created [CHANGELOG](https://github.com/vladmandic/automatic/blob/master/CHANGELOG.md)
+  this will be the source for all info about new things moving forward
+  and cross-posted to [Discussions#99](https://github.com/vladmandic/automatic/discussions/99) as well as discord [announcements](https://discord.com/channels/1101998836328697867/1109953953396957286)
+- optimize model loading on startup
+  this should reduce startup time significantly
+- set default cross-optimization method for each platform backend
+  applicable for new installs only
+  - `cuda` => Scaled-Dot-Product
+  - `rocm` => Sub-quadratic
+  - `directml` => Sub-quadratic
+  - `ipex` => InvokeAI's
+  - `mps` => Doggettx's
+  - `cpu` => Doggettx's
+- optimize logging
+- optimize profiling
+  now includes startup profiling as well as `cuda` profiling during generate
+- minor lightbox improvements
+- bugfixes... I don't recall a release that didn't include at least several of those
+
+other than that - the first stage of [Diffusers](https://github.com/huggingface/diffusers) integration is now in the master branch
+I don't recommend anyone try it yet (and don't even think of reporting issues for it)
+but if anyone wants to contribute, take a look at the [project page](https://github.com/users/vladmandic/projects/1/views/1)
+
+## Update for 2023-05-23
+
+Major internal work with perhaps not that much user-facing to show for it ;)
+
+- update core repos: **stability-ai**, **taming-transformers**, **k-diffusion**, **blip**, **codeformer**
+  note: to avoid disruptions, this is applicable for new installs only
+- tested with **torch 2.1**, **cuda 12.1**, **cudnn 8.9**
+  (production remains on torch 2.0.1 + cuda 11.8 + cudnn 8.8)
+- fully extend support of `--data-dir`
+  allows multiple installations to share pretty much everything, not just models
+  especially useful if you want to run in a stateless container or cloud instance
+- redo api authentication
+  api authentication will now use the same user/pwd (if specified) as the ui and strictly enforce it using HTTPBasicAuth
+  the new authentication is also fully supported in combination with ssl for both sync and async calls
+  if you want to use the api programmatically, see examples in `cli/sdapi.py`
+- add dark/light theme mode toggle
+- redo some `clip-skip` functionality
+- better matching for vae vs model
+- update to `xyz grid` to allow creation of a large number of images without creating the grid itself
+- update `gradio` (again)
+- more prompt parser optimizations
+- better error handling when importing image settings which are not compatible with the current install
+  for example, when the upscaler or sampler originally used is not available
+- fixes... amazing how many issues were introduced by porting a1111 v1.20 code while adding almost no new functionality
+  next one is v1.30 (still in dev) which does bring a lot of new features
+
+## Update for 2023-05-17
+
+This is a massive one due to the huge number of changes,
+but hopefully it will go ok...
+
+- new **prompt parsers**
+  select in UI -> Settings -> Stable Diffusion
+  - **Full**: my new implementation
+  - **A1111**: for backward compatibility
+  - **Compel**: as used in ComfyUI and InvokeAI (a.k.a. *Temporal Weighting*)
+  - **Fixed**: for really old backward compatibility
+- monitor **extensions** install/startup and
+  log if they modify any packages/requirements
+  this is a *deep-experimental* python hack, but I think it's worth it as extensions modifying requirements
+  is one of the most common causes of issues
+- added `--safe` command line flag mode which skips loading user extensions
+  please try to use it before opening a new issue
+- reintroduce `--api-only` mode to start server without ui
+- port *all* upstream changes from [A1111](https://github.com/AUTOMATIC1111/stable-diffusion-webui)
+  up to today - commit hash `89f9faa`
+
+## Update for 2023-05-15
+
+- major work on **prompt parsing**
+  this can cause some differences in results compared to what you're used to, but it's all about fixes & improvements
+  - prompt parser was adding commas and spaces as separate words and tokens and/or prefixes
+  - negative prompt weight using `[word:weight]` was ignored; it was always `0.909`
+  - bracket matching was anything but correct. Complex nested attention brackets are now working.
+  - btw, if you run with the `--debug` flag, you'll now actually see the parsed prompt & schedule
+- updated all scripts in `/cli`
+- add option in settings to force a different **latent sampler** instead of using the primary only
+- add **interrupt/skip** capabilities to process images
+
+## Update for 2023-05-13
+
+This is mostly about optimizations...
+
+- improved `torch-directml` support
+  especially interesting for **amd** users on **windows** where **torch+rocm** is not yet available
+  don't forget to run using `--use-directml`, otherwise the default is **cpu**
+- improved compatibility with **nvidia** rtx 1xxx/2xxx series gpus
+- fully working `torch.compile` with **torch 2.0.1**
+  using `inductor`, compile takes a while on the first run, but does result in a 5-10% performance increase
+- improved memory handling
+  for highest performance, you can also disable aggressive **gc** in settings
+- improved performance
+  especially *after* generate, as image handling has been moved to a separate thread
+- allow per-extension updates in extension manager
+- option to reset configuration in settings
+
+## Update for 2023-05-11
+
+- brand new **extension manager**
+  this is pretty much a complete rewrite, so new issues are possible
+- support for `torch` 2.0.1
+  note that if you are experiencing frequent hangs, this may be worth a try
+- updated `gradio` to 3.29.0
+- added `--reinstall` flag to force reinstall of all packages
+- auto-recover & re-attempt when `--upgrade` is requested but fails
+- check for duplicate extensions
+
+## Update for 2023-05-08
+
+Back online with a few updates:
+
+- bugfixes. yup, quite a lot of those
+- auto-detect some cpu/gpu capabilities on startup
+  this should reduce the need to tweak and tune settings like no-half, no-half-vae, fp16 vs fp32, etc.
+- configurable order of top-level tabs
+- configurable order of scripts in txt2img and img2img
+  for both, see sections in ui -> settings -> user interface
+
+## Update for 2023-05-04
+
+Again, a few days later...
+
+- reviewed/ported **all** commits from **A1111** upstream
+  a few are not applicable as I already have alternative implementations
+  and a very few I chose not to implement (save/restore last-known-good-config is a bad hack)
+  otherwise, we're fully up to date (it doesn't show on fork status as code merges were mostly manual due to conflicts)
+  but... due to the sheer size of the updates, this may introduce some temporary issues
+- redesigned server restart function
+  now available and working in ui
+  actually, since server restart is now a true restart and not a ui restart, it can be used much more flexibly
+- faster model load
+  plus support for slower devices via the stream-load function (in ui settings)
+- better logging
+  this includes a new `--debug` flag for more verbose logging when troubleshooting
+
+## Update for 2023-05-01
+
+Been a bit quieter for the last few days as the changes were quite significant, but finally here we are...
+
+- Updated core libraries: Gradio, Diffusers, Transformers
+- Added support for **Intel ARC** GPUs via Intel OneAPI IPEX (auto-detected)
+- Added support for **TorchML** (set by default when running on non-compatible GPU or on CPU)
+- Enhanced support for AMD GPUs with **ROCm**
+- Enhanced support for Apple **M1/M2**
+- Redesigned command params: run `webui --help` for details
+- Redesigned API and script processing
+- Experimental support for multiple **Torch compile** options
+- Improved sampler support
+- Google Colab:
+  Maintained by
+- Fixes, fixes, fixes...
+
+To take advantage of the new out-of-the-box tunings, it's recommended to delete your `config.json` so new defaults are applied. It's not necessary, but otherwise you may need to play with UI Settings to get the best of Intel ARC, TorchML, ROCm or Apple M1/M2.
+
+## Update for 2023-04-27
+
+A bit shorter list, as:
+
+- I've been busy with bugfixing
+  there are a lot of them, not going to list each here.
+  but it seems like the critical issues backlog is quieting down and soon I can focus on new feature development.
+- I've started collaborating with a couple of major projects,
+  hopefully this will accelerate future development.
+
+What's new:
+
+- ability to view/add/edit model description shown in extra networks cards
+- add option to specify a fallback sampler if the primary sampler is not compatible with the desired operation
+- make clip skip a local parameter
+- remove obsolete items from UI settings
+- set defaults for AMD ROCm
+  if you have issues, you may want to start with a fresh install so configuration can be created from scratch
+- set defaults for Apple M1/M2
+  if you have issues, you may want to start with a fresh install so configuration can be created from scratch
+
+## Update for 2023-04-25
+
+- update process image -> info
+- add VAE info to metadata
+- update GPU utility search paths for better GPU type detection
+- update git flags for wider compatibility
+- update environment tuning
+- update ti training defaults
+- update VAE search paths
+- add compatibility opts for some old extensions
+- validate script args for always-on scripts
+  fixes: deforum with controlnet
+
+## Update for 2023-04-24
+
+- identify race condition where generate locks up while fetching preview
+- add pulldowns to x/y/z script
+- add VAE rollback feature in case of NaNs
+- use samples format for live preview
+- add token merging
+- use **Approx NN** for live preview
+- create default `styles.csv`
+- fix setup not installing `tensorflow` dependencies
+- update default git flags to reduce number of warnings
+
+## Update for 2023-04-23
+
+- fix VAE dtype
+  should fix most issues with NaN or black images
+- add built-in Gradio themes
+- reduce requirements
+- more AMD-specific work
+- initial work on Apple platform support
+- additional PR merges
+- handle torch cuda crashing in setup
+- fix setup race conditions
+- fix ui lightbox
+- mark tensorflow as optional
+- add additional image name templates
+
+## Update for 2023-04-22
+
+- autodetect which system libs should be installed
+  this is a first pass of autoconfig for **nVidia** vs **AMD** environments
+- fix parsing of cmd line args from extensions
+- only install `xformers` if actually selected as the desired cross-attention method
+- do not attempt to use `xformers` or `sdp` if running on cpu
+- merge tomesd token merging
+- merge 23 PRs pending from the a1111 backlog (!!)
+
+*expect shorter updates for the next few days as I'll be partially OOO*
+
+## Update for 2023-04-20
+
+- full CUDA tuning section in UI Settings
+- improve exif/pnginfo metadata parsing
+  it can now handle 3rd party images or images edited in external software
+- optimized setup performance and logging
+- improve compatibility with some 3rd party extensions
+  for example, handle extensions that install packages directly from github urls
+- fix initial model download if no models found
+- fix vae not found issues
+- fix multiple git issues
+
+note: if you previously had command line optimizations such as `--no-half`, those are now ignored and moved to ui settings
+
+## Update for 2023-04-19
+
+- fix live preview
+- fix model merge
+- fix handling of user-defined temp folders
+- fix submit benchmark
+- option to override `torch` and `xformers` installer
+- separate benchmark data for system-info extension
+- minor css fixes
+- created initial merge backlog from pending prs on the a1111 repo
+  see #258 for details
+
+## Update for 2023-04-18
+
+- reconnect ui to active session on browser restart
+  this is one of the most frequently requested items, finally figured it out
+  works for text and image generation, but not for process as there is no progress bar reported there to start with
+- force unload `xformers` when not used
+  improves compatibility with AMD/M1 platforms
+- add `styles.csv` to UI settings to allow customizing its path
+- add `--skip-git` to cmd flags for power users that want
+  to skip all git checks and operations and perform manual updates
+- add `--disable-queue` to cmd flags that disables Gradio queues (experimental)
+  this forces it to use HTTP instead of WebSockets and can help on unreliable network connections
+- set scripts & extensions loading priority and allow custom priorities
+  fixes random extension issues:
+  `ScuNet` upscaler disappearing, `Additional Networks` not showing up on the XYZ axis, etc.
+- improve html loading order
+- remove some `asserts` causing runtime errors and replace them with user-friendly messages
+- update README.md
+
+## Update for 2023-04-17
+
+- **themes** are now dynamic and discovered from the list of available gradio themes on huggingface
+  it's quite a list of 30+ supported themes so far
+- added option to see a **theme preview** without the need to apply it or restart the server
+- integrated **image info** functionality into the **process image** tab and removed the separate **image info** tab
+- more installer improvements
+- fix urls
+- updated github integration
+- make model download optional if no models are found
+
+## Update for 2023-04-16
+
+- support for ui themes! go to *settings* -> *user interface* -> *ui theme*
+  includes 12 predefined themes
+- ability to restart server from ui
+- updated requirements
+- removed `styles.csv` from repo, it's now fully under user control
+- removed model-keyword extension as overly aggressive
+- rewrite of the fastapi middleware handlers
+- install bugfixes; hopefully the new installer is now ok
+  I really want to focus on features and not troubleshooting the installer
+
+## Update for 2023-04-15
+
+- update default values
+- remove `ui-config.json` from repo, it's now fully under user control
+- updated extensions manager
+- updated locon/lycoris plugin
+- enable quick launch by default
+- add multidiffusion upscaler extensions
+- add model keyword extension
+- enable strong linting
+- fix circular imports
+- fix extension updates
+- fix git update issues
+- update github templates
+
+## Update for 2023-04-14
+
+- handle duplicate extensions
+- redo exception handler
+- fix generate forever
+- enable cmdflags compatibility
+- change default css font
+- fix ti previews on initial start
+- enhance tracebacks
+- pin transformers version to last known good version
+- fix extension loader
+
+## Update for 2023-04-12
+
+This has been pending for a while, but I finally uploaded some massive changes
+
+- New launcher
+  - `webui.bat` and `webui.sh`:
+    Platform-specific wrapper scripts that start `launch.py` in a Python virtual environment
+    *Note*: Server can run without a virtual environment, but it is recommended to use one
+    This is a carry-over from the original repo
+    **If you're unsure which launcher to use, this is the one you want**
+  - `launch.py`:
+    Main startup script
+    Can be used directly to start the server in a manually activated `venv` or to run it without `venv`
+  - `installer.py`:
+    Main installer, used by `launch.py`
+  - `webui.py`:
+    Main server script
+- New logger
+- New exception handler
+- Built-in performance profiler
+- New requirements handling
+- Moved most command line flags into UI Settings
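
Several entries above mention the API server: the 2023-06-20 update lists the monitoring endpoints `/sdapi/v1/log` and `/sdapi/v1/system-info/status`, and the 2023-05-23 update notes that the API now enforces the same user/pwd as the UI via HTTPBasicAuth, with real examples in `cli/sdapi.py`. A minimal stdlib-only sketch of calling those endpoints programmatically; the default host/port and the JSON response shape are assumptions, not part of the changelog:

```python
import base64
import json
import urllib.request


def auth_header(user: str, pwd: str) -> dict:
    """Build the HTTP basic-auth header from the same user/pwd used for the ui."""
    token = base64.b64encode(f"{user}:{pwd}".encode()).decode()
    return {"Authorization": f"Basic {token}"}


def get_json(base_url: str, path: str, user: str = "", pwd: str = ""):
    """Fetch a JSON payload from one of the api endpoints, with optional basic auth."""
    headers = auth_header(user, pwd) if user else {}
    req = urllib.request.Request(base_url + path, headers=headers)
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


# Example usage (assumes the default local server address):
#   status = get_json("http://127.0.0.1:7860", "/sdapi/v1/system-info/status")
#   log_lines = get_json("http://127.0.0.1:7860", "/sdapi/v1/log", user="me", pwd="secret")
```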
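
The prompt-parser entries above (2023-05-15 and 2023-05-17) revolve around A1111-style attention syntax, including the fix for `[word:weight]` being stuck at `0.909`. As background: `0.909` is simply `1/1.1`, the conventional per-bracket attention factor. The sketch below is purely illustrative of how such bracket weights decode; it is not the actual SD.Next parser, and the `1.1` factor is the conventional value, assumed here:

```python
import re

BOOST = 1.1  # conventional per-bracket attention factor; [word] uses 1/1.1 (~0.909)


def bracket_weight(fragment: str) -> tuple[str, float]:
    """Return (word, weight) for one bracketed prompt fragment, e.g. '(word:1.5)'."""
    m = re.fullmatch(r"([\(\[]+)([^:\)\]]+)(?::([\d.]+))?[\)\]]+", fragment)
    if not m:
        return fragment, 1.0  # plain word, neutral weight
    brackets, word, explicit = m.groups()
    if explicit is not None:
        return word, float(explicit)  # explicit weight wins: (word:1.5) or [word:0.8]
    # implicit weight: nested brackets stack multiplicatively
    factor = BOOST if brackets[0] == "(" else 1 / BOOST
    return word, round(factor ** len(brackets), 3)
```

For example, `(word)` decodes to weight `1.1` and `[word]` to `0.909`, matching the constant mentioned in the 2023-05-15 notes.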