---
title: Pictero.com
emoji: 💻
colorFrom: gray
colorTo: yellow
sdk: gradio
sdk_version: 4.25.0
app_file: app.py
pinned: false
---

This Space runs on a free Hugging Face Space and therefore only provides limited CPU power. To use your own hardware, run it locally.

# Local installation

First clone the source code:

```
git clone https://huggingface.co/spaces/n42/pictero
```

Then install all required libraries, preferably inside a virtual environment:

```
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```

Then run the web app either via Python itself or through Gradio:

```
python app.py
```

Alternatively, use Gradio to run the web app; it offers a file watcher in case you make changes to the sources and will hot-reload the app:

```
gradio app.py
```

# Walk-Through

## Patience is king

The first time you start the process for a given model, it may take a while, because the backend needs to download the full model. Depending on the model, this requires **multiple gigabytes** of space on your device. This only happens **once**, except on *Hugging Face*, where the server cache is purged from time to time.

## Steps

This describes the minimal steps required to use this interface. Most default parameters don't need to be changed in order to work properly. The sketch below the list shows roughly what a run boils down to.

1. Select a **device**. A decent **GPU** is recommended.
2. Choose a **model**. If a fine-tuned model requires a trigger token, it will be added automatically. The **safety checker** option lets you suppress NSFW content, but in some cases it also produces black images for harmless content.
3. You *may* select a different scheduler. The scheduler controls **quality** and **performance**.
4. Now you can define your **prompt** and **negative prompt**.
5. The value for **inference steps** controls how many iterations your generation process will run. The higher this value, the longer the process takes and the better the image quality becomes. Start with a *lower value* to see how the model interprets your prompt; as soon as you get a satisfying result, increase it to produce high-quality output.
6. The **manual seed** lets you either get random results or reproduce the same output every time you run the process. Keep this field empty to enable random outputs.
7. Use the **guidance scale** to define how strictly the model follows your prompt.
8. Hit **run**!
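For orientation, here is a minimal sketch of what such a run boils down to with the 🤗 diffusers library (an assumption based on the troubleshooting notes below). The prompt and parameter values are only examples, and the actual `app.py` may wire things together differently.

```python
import torch
from diffusers import DiffusionPipeline, DDPMScheduler

# steps 1 and 2: pick a device and a model
device = "cuda" if torch.cuda.is_available() else "cpu"
pipeline = DiffusionPipeline.from_pretrained("sd-dreambooth-library/herge-style").to(device)

# step 3: optionally swap the scheduler/solver
pipeline.scheduler = DDPMScheduler.from_config(pipeline.scheduler.config)

# step 6: a manual seed makes runs reproducible; an empty seed field means "random"
generator = torch.Generator(device).manual_seed(42)

# steps 4, 5 and 7: prompt, negative prompt, inference steps and guidance scale
# (fine-tuned models like this one also expect their trigger token in the prompt;
#  the interface adds it for you automatically)
image = pipeline(
    prompt="a portrait of an astronaut",
    negative_prompt="blurry, low quality",
    num_inference_steps=20,  # start low, increase once the result looks right
    guidance_scale=7.5,      # how strictly the prompt is followed
    generator=generator,
).images[0]

image.save("result.png")
```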
## Hints

The **re-run** button runs the process again and only applies the changes you made in the **Inference settings** section, while the **run** button executes the whole process from scratch.

You have two options to persist your selected configuration: either you **copy the code** to an environment where you can execute Python code (e.g. Google Colab), or, after every successful run, you head to the bottom of the page. There you find a table containing a link to this interface that encodes the whole configuration.

# Areas

## Model specific settings

This allows you to select any model hosted on Hugging Face. Some models are fine-tuned and require a **trigger token** to be activated, like https://huggingface.co/sd-dreambooth-library/herge-style.

**Refiner** is a way to improve the quality of your image by re-processing it a second time.

The pipeline supports a way to prevent NSFW content from being created. This does not always work reliably, so these two options allow you to disable the feature.

**Attention slicing** divides the attention operation into multiple steps instead of one huge step. On machines with less than 64 GB of memory, or for images bigger than 512x512 pixels, this may increase performance drastically. On Apple silicon (M1, M2), it is recommended to keep this setting enabled. See https://huggingface.co/docs/diffusers/optimization/mps

## Scheduler/Solver

This is the part of the process that manipulates the output of the model in every loop/epoch (denoising step).

## Auto Encoder

The auto encoder is responsible for encoding the input and decoding the result back into the output image. **VAE slicing** and **VAE tiling** are options to improve performance and memory usage here.

## Adapters

Adapters allow you to modify or control the output, e.g. to apply specific styles. This interface supports **Textual Inversion** and **LoRA**. A sketch of how these options map to pipeline calls can be found at the end of this document.

# Customization

Update the file `appConfig.json` to add more models. Some models require you to accept their license agreement before you can access them, like https://huggingface.co/stabilityai/stable-diffusion-3-medium.

# Troubleshooting

## Result is a black image

Some parameters lead to black images. Deactivate them one by one and re-run the whole process (the sketch below shows the corresponding pipeline calls):

- the safety checker
- a wrong auto encoder (or, in some cases, setting one at all)
- a wrong scheduler/solver (e.g. DPMSolverMultistep seems to be incompatible with SD 1.5; better use DDPMScheduler)

I also faced a couple of bugs when using attention slicing (see https://discuss.huggingface.co/t/activating-attention-slicing-leads-to-black-images-when-running-diffusion-more-than-once/68623):

- you cannot re-run the inference process while attention slicing is enabled
- don't pass cross_attention_kwargs or guidance_scale to the pipeline
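If you need to reproduce these workarounds outside the interface, they translate roughly to the following diffusers calls. This is only a sketch under the assumption that the backend uses a standard Stable Diffusion pipeline; the model ID, prompt, and output file name are placeholders.

```python
from diffusers import StableDiffusionPipeline, DDPMScheduler

# placeholder model ID; use the model you are actually debugging
pipe = StableDiffusionPipeline.from_pretrained(
    "sd-dreambooth-library/herge-style",
    safety_checker=None,            # workaround: the safety checker can blank harmless images
    requires_safety_checker=False,
)

# workaround: swap an incompatible scheduler for DDPMScheduler
pipe.scheduler = DDPMScheduler.from_config(pipe.scheduler.config)

# workaround: attention slicing has caused black images on re-runs, so turn it off
pipe.disable_attention_slicing()

image = pipe("a lighthouse at night", num_inference_steps=20).images[0]
image.save("debug.png")
```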
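Similarly, the memory options and adapters described in the Areas section map to standard pipeline methods. Again only a sketch: the adapter repository IDs and the LoRA weight file name are hypothetical placeholders, not parts of this app.

```python
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("sd-dreambooth-library/herge-style")

# memory optimizations (see "Model specific settings" and "Auto Encoder")
pipe.enable_attention_slicing()  # split the attention computation into smaller steps
pipe.enable_vae_slicing()        # decode the latents in slices
pipe.enable_vae_tiling()         # decode very large images tile by tile

# adapters (see "Adapters"); both IDs below are hypothetical examples
pipe.load_textual_inversion("sd-concepts-library/some-concept")
pipe.load_lora_weights("some-user/some-lora", weight_name="lora.safetensors")
```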