Rodent Diffusion 1.5 Model Card

Stable Diffusion is a latent text-to-image diffusion model capable of generating photo-realistic images from any text input. The Rodent-Diffusion-1-5 checkpoint uses a custom Stable Diffusion v1.4 model as its base, into which the models listed below were merged at small weights (0.1-0.3). Some trigger keywords from the merged models may still have an effect, but for the most part no special keywords are needed.

Files are located in the "Files and versions" tab.

Safetensors file

Models:

analogDiffusion
Knolling Case
RPGDiffusion
classicnegative
cuteRich
inkpunk
evoartMj4
dreamshaper
deliberate
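
The card does not say which merge method or tool was used, so the following is only a sketch of one common approach, a weighted-sum merge, where each merged weight is `(1 - alpha) * base + alpha * other`. The function name `weighted_sum_merge` and the toy state dicts are hypothetical; a real checkpoint would hold tensors rather than lists of floats.

```python
from typing import Dict, List

# Toy stand-in for a checkpoint: parameter name -> flat list of weights.
StateDict = Dict[str, List[float]]


def weighted_sum_merge(base: StateDict, other: StateDict, alpha: float) -> StateDict:
    """Blend `other` into `base` with weight `alpha` (assumed merge method).

    Parameters missing from `other`, or with a different length, are copied
    from `base` unchanged.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    merged: StateDict = {}
    for key, base_values in base.items():
        other_values = other.get(key)
        if other_values is None or len(other_values) != len(base_values):
            merged[key] = list(base_values)
            continue
        merged[key] = [
            (1.0 - alpha) * b + alpha * o
            for b, o in zip(base_values, other_values)
        ]
    return merged


# Hypothetical toy weights; 0.25 sits inside the 0.1-0.3 range quoted above.
base_model = {"w": [1.0, 2.0], "b": [0.5]}
style_model = {"w": [3.0, 4.0]}
merged_model = weighted_sum_merge(base_model, style_model, alpha=0.25)
```

With a small `alpha`, the result stays close to the base model while picking up a fraction of the other model's weights, which matches the "small merges" described above.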

Examples

Professional, full-colour, HD digital portrait photo of a hipster. Detailed, intricate hair, high definition. Focused, crisp, clear and sharp. Ultra-realistic cinematic film still. taken with the Canon m50, 50mm focal. pastel shades AND professional photo of a hipster with vivid, vibrant earthy tones. 1960s Technicolor 16mm celluloid film look. Coffee bar in the background. Decaf latte.
Negative prompt: blurry, smudge, smear, painting, anime, sketch, doodle, illustration, drawing
Steps: 42, Sampler: Euler a, CFG scale: 5.25, Seed: 3642035934, Size: 512x640, Denoising strength: 0.666, Hires upscale: 1.689, Hires upscaler: Latent (bicubic antialiased)

Professional, full-colour, HD digital portrait photo of a humanoid rat. Detailed, intricate hair, high definition. Focused, crisp, clear and sharp. Ultra-realistic cinematic film still. taken with the Canon m50, 50mm focal. pastel shades AND professional photo of a rodent druid wearing amazing armour. Vibrant earthy tones. 1960s Technicolor 16mm celluloid film look. Gothic castle background.
Negative prompt: blurry, smudge, smear, painting, anime, sketch, doodle, illustration, drawing
Steps: 42, Sampler: Euler a, CFG scale: 5.25, Seed: 2537406181, Size: 512x640, Denoising strength: 0.666, Hires upscale: 1.689, Hires upscaler: Latent (bicubic antialiased)

Amazing painting of a stunning African woman. Incredible hairstyle, high definition. Focused, crisp, clear and sharp. Ultra-realistic. vibrant colours. AND matte portrait painting, cute African lady from the future. Vibrant brush strokes. oil on canvas, realism, acrylic impressionism neo-science fiction aesthetic with fantasy undertones mixed to create a warm feeling. 80's look and feel
Negative prompt: 3d, render, blurry, smudge, smear, photo
Steps: 42, Sampler: Euler a, CFG scale: 5.25, Seed: 3784463462, Size: 512x640, Denoising strength: 0.666, Hires upscale: 1.689, Hires upscaler: Latent (bicubic antialiased)

Anime style painting of a Tokyo street. Calm and peaceful. Relaxing. Incredible definition and detail. Crisp, clear and sharp focus. AND Anime inspired cinematic film still from the future the depicts a serene street during golden hour. Cel shading. Pastel shades and chilled vibes.
Negative prompt: 3d, render, blurry, smudge, smear, photo
Steps: 42, Sampler: Euler a, CFG scale: 5.25, Seed: 2306894277, Size: 512x640, Denoising strength: 0.666, Hires upscale: 1.689, Hires upscaler: Latent (bicubic antialiased)

Matte painting of a cat, psychedelic fractal fur, illusion, ethereal AND oil painting of a surreal cat with wild, human-like eyes and a massive grin
Negative prompt: 3d, render, blurry, smudge, smear, photo
Steps: 42, Sampler: Euler a, CFG scale: 5.25, Seed: 2534465260, Size: 512x640, Denoising strength: 0.666, Hires upscale: 1.689, Hires upscaler: Latent (bicubic antialiased)
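
Each example above ends with a comma-separated settings line. To reuse those settings in a script, the line can be split into key/value pairs. The following is a minimal sketch that assumes values never contain `, ` (true for the lines in this card); the function name `parse_generation_settings` is hypothetical and not part of any tool.

```python
def parse_generation_settings(line: str) -> dict:
    """Split a 'Key: value, Key: value' settings line into a dict of strings."""
    settings = {}
    for field in line.split(", "):
        # partition keeps everything after the first ': ' as the value
        key, _, value = field.partition(": ")
        settings[key.strip()] = value.strip()
    return settings


# Settings line copied from the first example in this card.
example_line = (
    "Steps: 42, Sampler: Euler a, CFG scale: 5.25, Seed: 3642035934, "
    "Size: 512x640, Denoising strength: 0.666, Hires upscale: 1.689, "
    "Hires upscaler: Latent (bicubic antialiased)"
)

settings = parse_generation_settings(example_line)

# Convert the fields most scripts need into numeric types.
steps = int(settings["Steps"])
cfg_scale = float(settings["CFG scale"])
seed = int(settings["Seed"])
width, height = (int(part) for part in settings["Size"].split("x"))
```

The parsed values can then be passed to whichever generation tool is in use; how each tool names these parameters varies, so the mapping is left to the reader.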

Due to the strange licence mix, this model is for personal use only, though I am working on an update with fewer restrictions.

Original Stable Diffusion Model Details

Developed by: Robin Rombach, Patrick Esser
Model type: Diffusion-based text-to-image generation model
Language(s): English
License: The CreativeML OpenRAIL M license is an Open RAIL M license, adapted from the work that BigScience and the RAIL Initiative are jointly carrying out in the area of responsible AI licensing. See also the article about the BLOOM Open RAIL license on which our license is based.
Model Description: This is a model that can be used to generate and modify images based on text prompts. It is a Latent Diffusion Model that uses a fixed, pretrained text encoder (CLIP ViT-L/14) as suggested in the Imagen paper.
Resources for more information: GitHub Repository, Paper.

Cite as:

@InProceedings{Rombach_2022_CVPR,
    author    = {Rombach, Robin and Blattmann, Andreas and Lorenz, Dominik and Esser, Patrick and Ommer, Bj\"orn},
    title     = {High-Resolution Image Synthesis With Latent Diffusion Models},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2022},
    pages     = {10684-10695}
}

License

This model is open access and available to all, with a CreativeML OpenRAIL-M license further specifying rights and usage. The CreativeML OpenRAIL License specifies:

You can't use the model to deliberately produce or share illegal or harmful outputs or content.
The authors claim no rights on the outputs you generate. You are free to use them and are accountable for their use, which must not go against the provisions set in the license.
You may re-distribute the weights and use the model commercially and/or as a service. If you do, please be aware that you have to include the same use restrictions as the ones in the license and share a copy of the CreativeML OpenRAIL-M with all your users (please read the license entirely and carefully).