
Solo Levelling Art Style on Stable Diffusion, trained via Dreambooth using the fast-DreamBooth.ipynb notebook by TheLastBen

model by Classacre

This is the Stable Diffusion model fine-tuned on the Solo Levelling Art Style concept, taught to Stable Diffusion with Dreambooth.

You can also train your own concepts and upload them to the library by using the fast-DreamBooth.ipynb notebook by TheLastBen. You can then run your new concept via diffusers: Colab Notebook for Inference, Spaces with the Public Concepts loaded

This is my first model; criticism and advice are welcome. Discord: "Classacre#1028". This model is inspired by @ogkalu and his comic-diffusion model (https://huggingface.co/ogkalu/Comic-Diffusion). I think it's pretty cool and you should check it out.

I've made this model out of admiration for Jang-Sung Rak (DUBU), who recently passed away. This model is not perfect, and it will never be perfect, as the original artist's art is irreplaceable.

Version 2.1

  • This new model uses the anythingv3.0 model as its base instead of SD 1.5. This adds more dynamic backgrounds to the generations but strays a bit away from the original style.
  • Characters and people are handled the same way as in V2 and have been improved to better reflect Jang-Sung Rak's art style.
  • Action generations often come out better at 2:1 ratios or at 1024 x 1024; they are often incomplete at 512 x 512.
  • The calm model, similar to version 2.0, is a good general model and may be better than the action model for most generations. Play around with the instance prompts mentioned below and see what you prefer.

The calm and action models have been combined into one .ckpt file. I've changed the naming scheme to better match the progress of the model, e.g. this version's checkpoint is called sololevellingV2.1.

It can be used by modifying the instance_prompt(s): SLCalm and SLAction
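If you run the model through diffusers rather than a web UI, a minimal sketch of how the instance prompts could be used is below. The path is a placeholder, and it assumes the .ckpt has been converted to the diffusers format; neither is part of the release itself.

```python
import torch
from diffusers import StableDiffusionPipeline

# Hypothetical local path: point this at the converted sololevellingV2.1 weights.
pipe = StableDiffusionPipeline.from_pretrained(
    "path/to/sololevellingV2.1-diffusers",
    torch_dtype=torch.float16,
).to("cuda")

# Use "SLCalm" for calm scenes and "SLAction" for action scenes.
image = pipe(
    "SLAction, man holding a sword, anime, manhwa, beautiful, 8k",
    height=1024,
    width=1024,  # action scenes reportedly come out more complete at 1024 x 1024
).images[0]
image.save("slaction_sample.png")
```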

This model was trained on 20 images in total (10 for calm scenes and 10 for action scenes), for 2000 total training steps at a 1e-6 learning rate. The text encoder was trained for 250 steps (1e-6), with 533 text encoder concept training steps and 71 conceptualization (realisation) images.

This model still suffers from text / chat bubbles, but this can be mitigated by adding them to the negative prompts (same as version 2.0).

Version 2.0

This is a massive improvement over the first version. I've split the model into two different models: one for non-action generations (SoloLevellingCalm.ckpt) and one for action generations (SoloLevellingAction.ckpt). I plan on merging the two into one model in the future once I understand how to do captions. The calm version of the model (SoloLevellingCalm.ckpt) is great for general generation with most prompts; it was trained on non-action images taken from the Solo Levelling manhwa.

Important prompt additions: add these prompts to make the generations look remotely like the Solo Levelling art style and to maintain consistency.

  • Positive prompts: anime, manhwa, beautiful, 8k
  • Negative prompts: chat bubble, chat bubbles, ugly

This model suffers from chat bubbles and added VFX words in its generations; this can often be mitigated by using the negative prompts listed under the important prompt additions, but it is not perfect.
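As an illustration, negative prompts can be passed directly at inference time in diffusers. The sketch below is an assumption-laden example: the path is a placeholder and the checkpoint is assumed to have been converted to the diffusers format.

```python
import torch
from diffusers import StableDiffusionPipeline

# Hypothetical path: the .ckpt would first need converting to the diffusers format.
pipe = StableDiffusionPipeline.from_pretrained(
    "path/to/SoloLevellingCalm-diffusers",
    torch_dtype=torch.float16,
).to("cuda")

image = pipe(
    prompt="man eating food in the subway station, anime, manhwa, beautiful, 8k",
    negative_prompt="chat bubble, chat bubbles, ugly",  # the additions recommended above
).images[0]
image.save("calm_sample.png")
```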

Sampler and CFG settings are identical to Version 1.0.

Version 1.0

It can be used by modifying the instance_prompt(s): sololeveling. This model was trained on 71 training images for 14200 total training steps, with the model saved every 3550 steps (25%) and the text encoder trained up to 35%. Made using Stable Diffusion v1.5 as the base model.

The final model struggles with calm / peaceful environments, as it was trained mainly on cinematic action scenes; this leads to style bleeding, where the AI creates action sequences from seemingly calm and peaceful prompts. Earlier models don't seem to have this problem, albeit they are not as sharp and do not reproduce the style as accurately. Negative prompts seem to lessen the action-sequence tendency in the final model, but the results are not as natural as with older models. Another thing to mention is that the model struggles at drawing eyes in action sequences; you may be able to play with the prompt to get eyes to show up, though. A comparison between the different model versions can be seen below:

Sampler used: DDIM. CFG: 7.

Prompt: man holding a sword, black hair, muscular, in a library, cinematic, full color, fighting a man (https://i.imgur.com/MBjzUVI.jpg)

Prompt: man eating food in the subway station, sololeveling, happy, cinematic, golden hour (https://i.imgur.com/L3MB4Ka.jpg)

In my opinion this model runs best with the DDIM sampler; however, I'm still pretty new to experimenting with samplers and my opinion about this may change in the future. Please experiment with the different samplers yourself and choose what you believe is best. The checkpoint at 106560 steps may be better than the final model.
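For those using diffusers instead of a web UI, here is a rough sketch of selecting the DDIM sampler and a CFG of 7. The path is a placeholder and the conversion to the diffusers format is assumed, not provided.

```python
import torch
from diffusers import StableDiffusionPipeline, DDIMScheduler

# Hypothetical path to a converted checkpoint of this model.
pipe = StableDiffusionPipeline.from_pretrained(
    "path/to/sololeveling-diffusers",
    torch_dtype=torch.float16,
).to("cuda")

# Swap the default scheduler for DDIM and use the CFG value quoted above.
pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)

image = pipe(
    "sololeveling, man holding a sword, black hair, muscular, in a library, cinematic, full color",
    guidance_scale=7,        # CFG: 7
    num_inference_steps=50,
).images[0]
image.save("ddim_sample.png")
```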

Here are the images used for training this concept:

(Training image gallery: sololeveling 0 through sololeveling 70.)
