Very simple results...

#58
by DangerD - opened

Using this model, users get this:

image.png
And now, with the same request, I get this:

image.png
What have I missed? :)

Set both width and height to 1024 and try again.
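For reference, a minimal sketch of the same fix with diffusers rather than the webui (the prompt here is just an example): SDXL base was trained at 1024x1024, so much smaller sizes tend to give degraded, overly simple results.

import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16, variant="fp16", use_safetensors=True,
).to("cuda")

# ask for the native resolution explicitly
image = pipe(prompt="a photo of a car", width=1024, height=1024).images[0]
image.save("car.png")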

Thank you, that works, but now I'm getting an error due to lack of memory. 11GB is not enough, any ideas?

https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Troubleshooting
Since you are using the auto1111 webui, maybe you can try the following solutions posted on their troubleshooting page:

The program needs 16gb of regular RAM to run smoothly. If you have 8gb RAM, consider making an 8gb page file/swap file, or use the --lowram option (if you have more gpu vram than ram).
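If it helps to check which limit you are actually hitting, here is a small probe (psutil is a third-party dependency, used here only for illustration):

import psutil  # pip install psutil
import torch

# how much system RAM and GPU VRAM this machine actually has
print(f"System RAM: {psutil.virtual_memory().total / 1024**3:.1f} GB")
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"GPU: {props.name}, VRAM: {props.total_memory / 1024**3:.1f} GB")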

Hmm, I have 64GB RAM and an 11GB GPU (1080 Ti).

https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Optimizations
Try adding --medvram or --lowvram when starting.
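If you ever try the same thing from diffusers instead of the webui, the rough equivalents of those flags are the pipeline's offloading helpers. A sketch, assuming diffusers and accelerate are installed:

import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16, variant="fp16", use_safetensors=True,
)

# roughly analogous to --medvram: keeps submodules in RAM and moves each one
# to the GPU only while it is running (no explicit .to("cuda") needed)
pipe.enable_model_cpu_offload()

# trades some speed for lower peak VRAM use during attention
pipe.enable_attention_slicing()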

I wonder, can you share how much time it took you to generate such an image?

For some reason, when I run a simple script using Gradio on an RTX 2060 6GB (64GB system RAM, AMD Ryzen 5 3400), it has been going for 15 hours and counting for a prompt as simple as "car"!

The code I use:

import time

import gradio as gr
import torch
from diffusers import DiffusionPipeline


# this code runs a gradio interface that allows the user to write a prompt
# and generate an image based on the prompt.
# the gradio interface launch uses about 1.5GB of RAM, and the model uses about 4GB of VRAM on a GPU.
# on the nVidia RTX-2060 GPU card it takes about 6.2 hours to generate an image with 25 steps of noise reduction.


# infer is the function that will be called when the user clicks submit
# parameter explanations can be found here: https://getimg.ai/guides/interactive-guide-to-stable-diffusion-steps-parameter
# this function receives:
# prompt - the prompt the user entered and wants the model to base the image on
def infer(prompt):
    # record the start time
    start = time.perf_counter()
    # call the model through the pipe, get the generated images, and keep the first one
    image = pipe(prompt=prompt).images[0]
    # compute how long the generation took
    elapsed = time.perf_counter() - start
    print(f"Time to generate image: {elapsed:.2f}s")
    # return the generated image and the time it took to generate it
    return image, f"Time to generate image: {elapsed:.2f}s"

# print the torch version and check if cuda is available
print(f"CUDA version: {torch.version.cuda}")
print(f"cuda is available: {torch.cuda.is_available()}")

# check if we are running on a GPU
if torch.cuda.is_available():
    # if yes, use a torch.float16 dtype for the model, create diffusion pipeline from a pretrained model stable-diffusion-xl-base-1.0
    pipe = DiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16, use_safetensors=True, variant="fp16")

    # set the device variable to cuda (nVidia GPU)
    device = "cuda"
    
    # move the model to the GPU
    pipe = pipe.to(device)
else:
    # if not, keep the default torch.float32 dtype and create the diffusion pipeline from the pretrained stable-diffusion-xl-base-1.0 model
    pipe = DiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0")

    # set the device variable to cpu
    device = "cpu"
    
    # move the model to the CPU
    pipe = pipe.to(device)


# launch the gradio interface: the function is infer, the inputs are one UI element per parameter,
# and the outputs are an image plus a text field
gr.Interface(fn=infer,
             inputs=[gr.Textbox(label='Prompt Input Text. 77 Token (Keyword or Symbol) Maximum')],
             outputs=['image', 'text'],
             title="Stable Diffusion XL 1.0",
             description="https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0",
             article="https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0").queue(max_size=5).launch()

This is what the VSCode terminal shows me after 14+ hours:

VSCode-terminal.png
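When a run seems stuck like this, it can help to see whether individual denoising steps are progressing at all. A sketch of a per-step timer using the pipe from the script above, assuming a diffusers version recent enough to support callback_on_step_end (the names here are illustrative):

import time

last = time.perf_counter()

def report_step(pipeline, step, timestep, callback_kwargs):
    # print how long each denoising step takes
    global last
    now = time.perf_counter()
    print(f"step {step}: {now - last:.1f}s")
    last = now
    return callback_kwargs

image = pipe(prompt="car", num_inference_steps=25,
             callback_on_step_end=report_step).images[0]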

Never mind. I changed my code to the simple hello-world example in the SDXL README and reduced the number of steps from 50 to 25, and it took about 11 minutes.

Now I use this code:

import time
from diffusers import DiffusionPipeline
import torch

prompt = "An astronaut riding a green horse"
numOfSteps = 25

# record the start time; note that this measurement also includes the model-loading time below
start = time.time()

# create a pipeline from a pretrained model stable-diffusion-xl-base-1.0 with torch.float16 dtype
pipe = DiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16, use_safetensors=True, variant="fp16")
pipe.to("cuda")

# generate the image
generated_image = pipe(prompt=prompt, num_inference_steps=numOfSteps).images[0]

# print the time it took to generate the image
print(f"Time to generate image in {numOfSteps} steps: {time.time() - start:.2f}s")

# save the generated image to a file
generated_image.save(f'generated_image_{numOfSteps}.png')

In the Gradio code I saw the GPU's 6GB memory always at 95% and the GPU 3D graph (in Task Manager) at 100% the whole time. Now the memory is still occupied during generation, but the 3D graph is not used at all: the CUDA graph sits between 85% and 100%, and the GPU runs at 48-51°C. So the problem was probably Gradio.

I have run between 2 and 25 steps, and these are the generation times I got so far (I noticed a slight influence from the style of the generated image):
stats.png
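For anyone reproducing these numbers: load the pipeline once and time only the pipe() call, otherwise checkpoint loading dominates the measurement. A sketch under that assumption (the step counts are just examples):

import time

import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16, variant="fp16", use_safetensors=True,
).to("cuda")

for steps in (2, 5, 10, 14, 17, 25):
    start = time.perf_counter()
    image = pipe(prompt="car", num_inference_steps=steps).images[0]
    print(f"{steps} steps: {time.perf_counter() - start:.1f}s")
    image.save(f"generated_image_{steps}.png")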

Here is an example of an image I got after 6:32 minutes with 14 steps:
generated_image_14.png

And in 8:16 minutes with 17 steps:
generated_image_17.png

I wonder, can you share how much time it took you to generate such an image?

~1 minute; I use the automatic1111 webui (20 steps plus 5 refiner steps). With 30/10 it's much longer.

Thanks for the reply.

When I generate 1024x1024 with 20 steps on the "stabilityai/stable-diffusion-xl-base-1.0" model and 5 refinement steps on the "stabilityai/stable-diffusion-xl-refiner-1.0" model, I see no difference in the resulting image compared to just using the 20 base steps. I don't think 5 refinement steps can do much good.
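For what it's worth, the SDXL model card runs the refiner as an ensemble of experts via denoising_end/denoising_start, handing the refiner the base model's latents rather than tacking a few extra steps on top. A sketch of that pattern (the 0.8 split is the card's example value, not something tested in this thread):

import torch
from diffusers import DiffusionPipeline

base = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16, variant="fp16", use_safetensors=True,
).to("cuda")

refiner = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-refiner-1.0",
    text_encoder_2=base.text_encoder_2,  # share components to save VRAM
    vae=base.vae,
    torch_dtype=torch.float16, variant="fp16", use_safetensors=True,
).to("cuda")

prompt = "An astronaut riding a green horse"
n_steps = 20
high_noise_frac = 0.8  # base denoises the first 80%, refiner the last 20%

latents = base(prompt=prompt, num_inference_steps=n_steps,
               denoising_end=high_noise_frac, output_type="latent").images
image = refiner(prompt=prompt, num_inference_steps=n_steps,
                denoising_start=high_noise_frac, image=latents).images[0]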
