---
license: openrail
pipeline_tag: text-to-image
---
This is the result of training with [a mixed style/object dataset kindly provided by @Pashahlis](val-test-7e-8.zip) at a learning rate of 1e-6 for 30 epochs (70 steps/epoch) with batch size 18.
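
As a quick sanity check on those numbers, here is the arithmetic spelled out. It assumes one optimizer step per batch (no gradient accumulation), which isn't stated above, and the variable names are purely illustrative.

```python
# Figures quoted in this README.
EPOCHS = 30
STEPS_PER_EPOCH = 70
BATCH_SIZE = 18

# Assumption: one optimizer step per batch, i.e. no gradient accumulation.
total_steps = EPOCHS * STEPS_PER_EPOCH

# Upper bounds only: some batches may not be completely full.
max_samples_per_epoch = STEPS_PER_EPOCH * BATCH_SIZE
max_samples_seen = total_steps * BATCH_SIZE

print(total_steps, max_samples_per_epoch, max_samples_seen)
```

So the epoch 30 checkpoint corresponds to 2,100 training steps, with at most 1,260 training samples seen per epoch.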
## validation
This model was trained using Victor C Hall's excellent Stable Diffusion finetuner [EveryDream2](https://github.com/victorchall/EveryDream2trainer).
EveryDream2 configuration files for this training session are in this repo, [here](_everydream2_config).
The configuration files enable [a validation pass using a 15% split of the dataset with the noise seed held fixed during validation](https://github.com/victorchall/EveryDream2trainer/blob/main/doc/VALIDATION.md), to give the following loss curve (stitched together from two runs of 60 epochs each):
![validation graph](val-graph-base.jpg)
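
To illustrate why a fixed split and a fixed noise seed make the curve comparable from epoch to epoch, here is a minimal sketch in plain Python. It is not EveryDream2's actual implementation: the function names, seed values and file names are all hypothetical, and only the 15% figure comes from the configuration described above.

```python
import random


def split_train_val(items, val_fraction=0.15, seed=555):
    """Deterministically hold out a fraction of the dataset for validation.

    The seed value is an arbitrary placeholder, not EveryDream2's default.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_val = int(round(len(shuffled) * val_fraction))
    # First n_val shuffled items become the validation set, the rest train.
    return shuffled[n_val:], shuffled[:n_val]


def validation_noise(count, seed=0):
    """Draw the same pseudo-random noise values on every validation pass."""
    rng = random.Random(seed)  # re-seeded on each call, so output repeats
    return [rng.gauss(0.0, 1.0) for _ in range(count)]


# Hypothetical file names, for illustration only.
images = [f"image_{i:04d}.jpg" for i in range(100)]
train, val = split_train_val(images)
print(len(train), len(val))

# Same seed, same noise: any change in validation loss between epochs is
# then down to the model weights, not to a different random draw.
print(validation_noise(3) == validation_noise(3))
```

Because both the held-out images and the noise are identical on every pass, the validation loss at epoch 10 can be compared directly with the loss at epoch 30.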
Although the training ran for 120 epochs in total, the validation graph suggests that the best results are going to be at some point between epoch 10 and epoch 30:
![validation graph](val-graph-sweet-spot.jpg)
This repository contains a diffusers format model for epoch 30:
![validation graph](val-graph-ep30.jpg)
It's available in [InvokeAI](https://github.com/invoke-ai) by adding the diffusers repo id `damian0815/pashahlis-val-test-1e-6-ep30`, or for manual download in .ckpt format if you're using a clumsier web UI: [pashahlis-1e-6-ep30.ckpt](pashahlis-1e-6-ep30.ckpt).
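
If you'd rather load it straight from Python, something like the following should work with the `diffusers` library. This is a sketch, not a tested recipe for this exact model: it assumes `diffusers` and `torch` are installed, and the `generate` helper, its defaults and the output file name are made up for illustration.

```python
REPO_ID = "damian0815/pashahlis-val-test-1e-6-ep30"


def generate(prompt, seed=0, repo_id=REPO_ID):
    """Generate one image for `prompt` (hypothetical helper)."""
    # Imported lazily so this file can be read without diffusers/torch installed.
    import torch
    from diffusers import DiffusionPipeline

    # Downloads the weights from the Hugging Face Hub on first use.
    pipe = DiffusionPipeline.from_pretrained(repo_id)
    pipe = pipe.to("cuda" if torch.cuda.is_available() else "cpu")

    # A seeded generator makes the result repeatable for a given prompt.
    generator = torch.Generator().manual_seed(seed)
    return pipe(prompt, generator=generator).images[0]


# Example usage (downloads several GB of weights):
# generate("ancient temple").save("ancient-temple-ep30.png")
```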
## ... but is it finished training?
Training an SD model is subjective. Picking when to stop is a trade-off between how faithfully the model reproduces the training data the way you want it to, and how flexibly it can apply what it has learned to novel outputs.
There are some [generated image samples](grates) from each epoch to look at (generated with my python tool [grate](https://pypi.org/project/sdgrate/)).
For example, [this one (warning: huuuge image, 20,000x10,000 pixels): ![grid of images](pashahlis-val-test_as-received_lr1e-6-768x768-thumbnail.jpg)](grates/pashahlis-val-test_as-received_lr1e-6-768x768.jpg)
I'm satisfied that the training quality roughly follows the shape of the validation graph, but you might want to look at this image closely to verify for yourself that the best model is probably somewhere between epoch 30 and epoch 40.
* Notice how at epochs 30 and 40 the `ancient temple` prompt produces a variety of different temples with different seeds. By epoch 50 some weird artefacts are starting to creep in, with the results becoming progressively more monotonous and, especially from epoch 80 onwards, increasingly bizarre.
* The `scottish ruined castle` images only start looking `ruined` by epoch 40, but already at epoch 50 they are showing signs of rigidity, ignoring the difference in seed to produce the same style of castle each time.
* The `fantasy orchard` prompt is notably resilient, but the `vibrant fairy village` two columns over has lost all trace of `fairy village` already by epoch 40.
* The `snail wedding ceremony` survives until epoch 70, but the `brushstrokes, canvas, fine art` quality of the final prompt has been replaced by an anime aesthetic by epoch 50.
## try them yourself
If you want to try them out for yourself, other epochs are available at [damian0815/pashahlis-val-test-1e-6-ep40](https://huggingface.co/damian0815/pashahlis-val-test-1e-6-ep40), [damian0815/pashahlis-val-test-1e-6-ep80](https://huggingface.co/damian0815/pashahlis-val-test-1e-6-ep80), and [damian0815/pashahlis-val-test-1e-6-ep110](https://huggingface.co/damian0815/pashahlis-val-test-1e-6-ep110).
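
To compare the epochs side by side, one option is to run the same prompt and seeds through each checkpoint. The sketch below assumes `diffusers` and `torch` are installed. The helper names and default seeds are hypothetical, and only the repo ids and epoch numbers come from this README.

```python
EPOCHS = (30, 40, 80, 110)


def repo_id_for_epoch(epoch):
    """Build the Hugging Face repo id for one of the published epochs."""
    return f"damian0815/pashahlis-val-test-1e-6-ep{epoch}"


def compare_epochs(prompt, seeds=(0, 1, 2), epochs=EPOCHS):
    """Return {(epoch, seed): image} for one prompt (hypothetical helper)."""
    # Imported lazily so this file can be read without diffusers/torch installed.
    import torch
    from diffusers import DiffusionPipeline

    device = "cuda" if torch.cuda.is_available() else "cpu"
    results = {}
    for epoch in epochs:
        # Each checkpoint is a separate multi-GB download on first use.
        pipe = DiffusionPipeline.from_pretrained(repo_id_for_epoch(epoch)).to(device)
        for seed in seeds:
            # Re-using the same seeds across epochs isolates the effect of training.
            generator = torch.Generator().manual_seed(seed)
            results[(epoch, seed)] = pipe(prompt, generator=generator).images[0]
        del pipe  # drop the reference before loading the next checkpoint
    return results


# Example usage:
# images = compare_epochs("scottish ruined castle")
# images[(40, 0)].save("castle-ep40-seed0.png")
```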