
language:
  - en
tags:
  - text-to-image
  - stable-diffusion
  - stable-diffusion-xs
  - sdxs
pipeline_tag: text-to-image

Stable Diffusion XS

Model Details

Stable Diffusion XS (SDXS) is a modified version of Stable Diffusion designed for fast inference.

Usage


from diffusers import DiffusionPipeline
import torch

MODEL_PATH = "sdxs"
base = DiffusionPipeline.from_pretrained(
    MODEL_PATH,
    trust_remote_code=True,
    torch_dtype=torch.float16,
    variant="fp16",
    use_safetensors=True
).to("cuda")

prompt = "柴犬、カラフルアート"  # Japanese for "Shiba Inu, colorful art"
negative_prompt = ""

def tokenize_prompt(tokenizer, prompt):
    text_inputs = tokenizer(
        prompt,
        padding="max_length",
        max_length=tokenizer.model_max_length,
        truncation=True,
        return_tensors="pt",
    )
    text_input_ids = text_inputs.input_ids
    return text_input_ids

def encode_prompt(text_encoders, tokenizers, prompt, hidden_size, model_max_length=77):
    prompt_embeds_list = []

    for i, text_encoder in enumerate(text_encoders):
        if text_encoder is not None:
            tokenizer = tokenizers[i]

            text_input_ids = tokenize_prompt(tokenizer, prompt)
            prompt_embeds = text_encoder(
                text_input_ids.to(text_encoder.device),
                output_hidden_states=True,
                return_dict=False,
            )
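            # First element of the output tuple, used here as the pooled embedding.
            pooled_prompt_embeds = prompt_embeds[0]
            # Last element holds all hidden states; take the penultimate layer.
            prompt_embeds = prompt_embeds[-1][-2]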
        else:
            # No encoder in this slot: fall back to zero placeholders.
            prompt_embeds = torch.zeros((1, model_max_length, hidden_size))
            pooled_prompt_embeds = torch.zeros((1, hidden_size))

        # Only the pooled output of the final text encoder is kept.
        prompt_embeds = prompt_embeds.to("cuda")
        bs_embed, seq_len, _ = prompt_embeds.shape
        prompt_embeds = prompt_embeds.view(bs_embed, seq_len, -1)
        prompt_embeds_list.append(prompt_embeds)

    prompt_embeds = torch.concat(prompt_embeds_list, dim=-1)
    pooled_prompt_embeds = pooled_prompt_embeds.view(bs_embed, -1)
    return prompt_embeds, pooled_prompt_embeds


prompt_embeds, pooled_prompt_embeds = encode_prompt([None, base.text_encoder],[None, base.tokenizer], prompt, 768)
negative_prompt_embeds, negative_pooled_prompt_embeds = encode_prompt([None, base.text_encoder],[None, base.tokenizer], negative_prompt, 768)

#generator = [torch.Generator(device="cuda").manual_seed(i) for i in range(4)]

image = base(
    prompt_embeds=prompt_embeds,
    pooled_prompt_embeds=pooled_prompt_embeds,
    negative_prompt_embeds=negative_prompt_embeds,
    negative_pooled_prompt_embeds=negative_pooled_prompt_embeds,
    num_inference_steps=20,
).images[0]

display(image)  # works in Jupyter/IPython; in a plain script, use image.save("output.png")
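
The `encode_prompt` helper above fills the missing first encoder slot (`None`) with zeros and concatenates it with the real encoder's hidden states along the feature axis. The following is a minimal sketch of that shape arithmetic only, run on CPU with stand-in tensors. The sequence length 77 and placeholder width 768 mirror the values passed in the usage code; the real encoder's width is not stated in this card, so `encoder_dim = 768` is an assumption made for illustration.

```python
import torch

# Illustrative sizes: 77 and 768 mirror the usage code above;
# encoder_dim is an assumption, not a documented property of the model.
batch, seq_len, placeholder_dim, encoder_dim = 1, 77, 768, 768

# Zero placeholder standing in for the missing first encoder (the `None` slot).
placeholder = torch.zeros((batch, seq_len, placeholder_dim))

# Random stand-in for the penultimate hidden state of the real text encoder.
encoder_hidden = torch.randn((batch, seq_len, encoder_dim))

# Same feature-axis concatenation that encode_prompt performs.
prompt_embeds = torch.concat([placeholder, encoder_hidden], dim=-1)

print(prompt_embeds.shape)  # torch.Size([1, 77, 1536])
```

Under these assumptions, the first 768 features of every token are zero and only the remaining features carry information from the text encoder.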

Model Details

Developed by: AiArtLab
Model type: Diffusion-based text-to-image generative model
Model Description: This model is fine-tuned from colorfulxl_v27.
License:

Uses

Direct Use

Research: possible research areas/tasks include:

Generation of artworks and use in design and other artistic processes.
Applications in educational or creative tools.
Research on generative models.
Safe deployment of models which have the potential to generate harmful content.
Probing and understanding the limitations and biases of generative models.

Excluded uses are described below.

Out-of-Scope Use

The model was not trained to produce factual or true representations of people or events, so using it to generate such content is out of scope.

Limitations and Bias

Limitations

The model does not achieve perfect photorealism.
The model cannot render legible text.
The model struggles with more difficult tasks that involve compositionality, such as rendering an image corresponding to “A red cube on top of a blue sphere”.
Faces and people in general may not be generated properly.
The autoencoding part of the model is lossy.

Bias

While the capabilities of image generation models are impressive, they can also reinforce or exacerbate social biases.

How to cite

@misc{SDXS, 
    url    = {https://huggingface.co/recoilme/sdxs},
    title  = {Stable Diffusion XS}, 
    author = {recoilme}
}

Contact

For questions and comments about the model, please join https://aiartlab.org/.
For future announcements / information about AiArtLab AI models, research, and events, please follow Discord.
For business and partnership inquiries, please contact https://t.me/recoilme