Init image support in Playground?

#10
by Softology - opened

I am trying to get Playground to edit images by passing in an init image. I tried the following; it is the same basic code I use with SDXL and other pipeline-based models when setting an init image.

import torch
from diffusers import AutoPipelineForImage2Image
from diffusers.utils import load_image

pipe = AutoPipelineForImage2Image.from_pretrained(
    "playgroundai/playground-v2.5-1024px-aesthetic", torch_dtype=torch.float16, use_safetensors=True
)
pipe.enable_model_cpu_offload()

init_image = load_image("seed_image.png")
prompt = "prompt goes here"

image = pipe(
    prompt=prompt,
    negative_prompt="negative prompt here",
    guidance_scale=3.0,
    width=1024,
    height=1024,
    safety_checker=False,
    image=init_image,
    strength=0.5,
    num_inference_steps=50,
).images[0]
image.save("output_image.png")

Here is the source image, and results from the prompt "a female cyborg" in v2 and v2.5.

[Attached: Scarlett Johansson 1024x1024.png (source), a female cyborg [Playground v2] 1901412987 6.png, a female cyborg [Playground v2.5] 1901412987 1.png]

Any recommended settings to improve these? The init image is passed into the pipeline, but the "edited" output result is poor. Is Playground even supposed to support seed images, especially v2.5? v2 works "OK" and I have been able to use it to create movies by passing the last frame in as the init image for the next frame.

eg https://www.instagram.com/p/C3qnlY3s6C9/

But with the faded-out results from v2.5 it cannot be used to make movies like that.
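
For reference, here is a minimal sketch of that frame-feedback loop (not my exact code; it assumes the pipe from the snippet above, and the prompt, strength, and frame count are illustrative):

# Minimal sketch of a frame-feedback loop: each new frame is generated with the
# previous frame as the init image. Assumes `pipe` from the snippet above;
# the prompt, strength, and frame count are illustrative.
from diffusers.utils import load_image

frame = load_image("seed_image.png")
for i in range(120):
    frame = pipe(
        prompt="a female cyborg",
        image=frame,
        strength=0.5,
        guidance_scale=3.0,
        num_inference_steps=50,
    ).images[0]
    frame.save(f"frame_{i:04d}.png")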
Also, using the above code seems to cause the models to be redownloaded every day for some reason. Any ideas as to why would help me save bandwidth and time. Once they download I can run Playground multiple times OK, but the next day it will redownload the models again. Very strange.
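
In case the daily redownloads come from the default Hugging Face cache being cleared or relocated, here is a minimal sketch of pinning the cache to a persistent folder (the cache_dir path below is hypothetical):

import torch
from diffusers import AutoPipelineForImage2Image

# Assumption: the redownloads happen because the default cache is cleared between
# sessions. cache_dir pins the weights to a persistent folder (setting HF_HOME works
# too); local_files_only skips the Hub entirely once the files are present.
pipe = AutoPipelineForImage2Image.from_pretrained(
    "playgroundai/playground-v2.5-1024px-aesthetic",
    torch_dtype=torch.float16,
    use_safetensors=True,
    cache_dir="D:/hf_cache",     # hypothetical persistent path
    # local_files_only=True,     # uncomment after the first successful download
)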

Thanks for any ideas.

PS, you can add to your model card that Playground v2 and v2.5 are supported in Visions of Chaos now.

hi!
thanks for reporting the issue - we only started supporting playground v2.5 with the img2img pipeline very recently, in this PR: https://github.com/huggingface/diffusers/pull/7132

Can you try it with the most recent version of diffusers? If you continue to see issues, please open an issue on our github repo!

thanks!


For the most recent version, is this pip install enough and correct?
pip install git+https://github.com/huggingface/diffusers.git
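
And a quick way to confirm which build actually got installed (a sketch; the exact dev version string will vary over time):

import diffusers

# Builds installed straight from GitHub main typically report a ".dev0" suffix,
# e.g. "0.27.0.dev0" (the exact number here is only an assumption).
print(diffusers.__version__)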

I am getting different results now. Not as washed out, but still not correct for low init strength values. Here are strengths 0.3 through 0.8:

[Attached: 0.3.png, 0.4.png, 0.5.png, 0.6.png, 0.7.png, 0.8.png]

They all have a red tint now, and there is a vast difference between 0.7 and 0.8.
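
For reference, a minimal sketch of running such a sweep in one loop with a fixed generator, so strength is the only variable (it assumes the pipe and init_image from the snippet at the top; the prompt and seed value are illustrative):

import torch

# Sweep strength with everything else held fixed, so any colour shift can be
# attributed to strength alone. Assumes `pipe` and `init_image` from above;
# the prompt and seed are illustrative.
for strength in (0.3, 0.4, 0.5, 0.6, 0.7, 0.8):
    generator = torch.Generator(device="cuda").manual_seed(1901412987)
    image = pipe(
        prompt="a female cyborg",
        image=init_image,
        strength=strength,
        guidance_scale=3.0,
        num_inference_steps=50,
        generator=generator,
    ).images[0]
    image.save(f"{strength}.png")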

Which GitHub repo should I be posting issues to? I can't find a Playground GitHub. Happy to move this issue there if it gets more dev eyes on it.

Thanks.

I am encountering the same issue: when rendering plants they come out excessively green, and the majority of the image is tinted green, even areas that should not be green, so the colors are entirely incorrect throughout the image.


Is that for v2.5 model with a seed image?
If not, please create your own issue. I am trying to keep this focused on my problem.

Yes, I am referring to the V2.5 model's 'render from render' mode; my problem is similar to the one mentioned above, with the colors being utterly incorrect.

[Attached: generated_image_1b6a31c2-2f29-4da6-9503-aafa57f4f17f.png, 1.png]
This is the outcome of using the 'image-to-image generation' feature in V2.5: normal images are input, but the resulting output images have entirely incorrect pixels.

So is that using their website to generate? Not your own code?
If so, that shows that my issue may not be with my code after all then.


To make the issue even clearer, here are 2 scripts showing the problem between the v2 and v2.5 models. Activate your Playground venv and then run these. The code is basically the same in both; the only change is loading the v2 vs v2.5 model.

scarlett.png

The seed image I use in these scripts is attached. I cannot attach txt or py files, so here they are inline...

###########################################################################################################################################################

import sys
import os
import datetime
from diffusers import DiffusionPipeline
from diffusers import AutoPipelineForImage2Image
from diffusers.utils import load_image, make_image_grid
import torch
import argparse
import numpy as np
import cv2
import PIL
from PIL import Image, ImageEnhance
from scipy.ndimage.filters import median_filter

sys.stdout.write("Parsing arguments ...\n")
sys.stdout.flush()

sys.stdout.write("Setting up init image pipeline ...\n")
sys.stdout.flush()

pipe = AutoPipelineForImage2Image.from_pretrained(
f"playgroundai/playground-v2-1024px-aesthetic",
torch_dtype=torch.float16,
use_safetensors=True,
add_watermarker=False,
variant="fp16"
)
pipe.to("cuda")

init_image = load_image("scarlett.png")

sys.stdout.write("Generating image ...\n")
sys.stdout.flush()

image = pipe(
prompt="a portrait of a zombie",
negative_prompt="",
guidance_scale=3.0,
width=1024,
height=1024,
safety_checker=False,
image=init_image,
strength=0.3,
num_inference_steps=50
).images[0]

sys.stdout.write("Saving image ...\n")
sys.stdout.flush()

image.save("scarlett_v2_0.3_output.png")

image = pipe(
prompt="a portrait of a zombie",
negative_prompt="",
guidance_scale=3.0,
width=1024,
height=1024,
safety_checker=False,
image=init_image,
strength=0.5,
num_inference_steps=50
).images[0]

sys.stdout.write("Saving image ...\n")
sys.stdout.flush()

image.save("scarlett_v2_0.5_output.png")

image = pipe(
prompt="a portrait of a zombie",
negative_prompt="",
guidance_scale=3.0,
width=1024,
height=1024,
safety_checker=False,
image=init_image,
strength=0.7,
num_inference_steps=50
).images[0]

sys.stdout.write("Saving image ...\n")
sys.stdout.flush()

image.save("scarlett_v2_0.7_output.png")

sys.stdout.write("Done\n")
sys.stdout.flush()

###########################################################################################################################################################

import sys
import os
import datetime
from diffusers import DiffusionPipeline
from diffusers import AutoPipelineForImage2Image
from diffusers.utils import load_image, make_image_grid
import torch
import argparse
import numpy as np
import cv2
import PIL
from PIL import Image, ImageEnhance
from scipy.ndimage.filters import median_filter

sys.stdout.write("Parsing arguments ...\n")
sys.stdout.flush()

sys.stdout.write("Setting up init image pipeline ...\n")
sys.stdout.flush()

pipe = AutoPipelineForImage2Image.from_pretrained(
f"playgroundai/playground-v2.5-1024px-aesthetic",
torch_dtype=torch.float16,
use_safetensors=True,
add_watermarker=False,
variant="fp16"
)
pipe.to("cuda")

init_image = load_image("scarlett.png")

sys.stdout.write("Generating image ...\n")
sys.stdout.flush()

image = pipe(
prompt="a portrait of a zombie",
negative_prompt="",
guidance_scale=3.0,
width=1024,
height=1024,
safety_checker=False,
image=init_image,
strength=0.3,
num_inference_steps=50
).images[0]

sys.stdout.write("Saving image ...\n")
sys.stdout.flush()

image.save("scarlett_v2.5_0.3_output.png")

image = pipe(
prompt="a portrait of a zombie",
negative_prompt="",
guidance_scale=3.0,
width=1024,
height=1024,
safety_checker=False,
image=init_image,
strength=0.5,
num_inference_steps=50
).images[0]

sys.stdout.write("Saving image ...\n")
sys.stdout.flush()

image.save("scarlett_v2.5_0.5_output.png")

image = pipe(
prompt="a portrait of a zombie",
negative_prompt="",
guidance_scale=3.0,
width=1024,
height=1024,
safety_checker=False,
image=init_image,
strength=0.7,
num_inference_steps=50
).images[0]

sys.stdout.write("Saving image ...\n")
sys.stdout.flush()

image.save("scarlett_v2.5_0.7_output.png")

sys.stdout.write("Done\n")
sys.stdout.flush()

###########################################################################################################################################################

Hopefully that helps?

Following your approach, I managed to resolve the issue; however, I discovered that the width and height settings I provided have no effect, so the output image is not being resized to the custom dimensions I configured.
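
As far as I can tell, the SDXL-style img2img pipeline loaded here does not take width/height at all; the output resolution follows the init image, so resizing the input is the way to control the output size. A minimal sketch (the target size is illustrative):

from diffusers.utils import load_image

# The img2img pipeline derives the output resolution from the input image, so
# resize the init image to the size you want (768x768 here is illustrative).
init_image = load_image("scarlett.png").resize((768, 768))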


You resolved it using my example code, which shows that it is not solved? Can you share what you did to fix the code?

I copied the piece of code you provided and managed to generate images without actually integrating with SD. However, I find that the image-to-image results are not satisfactory: the original images were high quality, but after the process the output images are degraded or damaged. Perhaps it's an issue with the keywords I used for controlling the output quality.


OK, my issue is about image to image using a seed image, as the code shows. Please start your own issue in the future if you have other problems. I am trying to get the devs to help with my specific problem here.

Is the restriction of only 77 tokens for generating images in the code or in the model itself? Have you resolved this issue?


That has nothing to do with my issue as shown above with the full code (my prompt length is way under 77 tokens). No, I have not resolved the issue.

I sense that you're beefing up with new features in V2.5, pal. Share your new capabilities!


Are you for real or a troll? My code is above that shows my problem. If you can help fix it please do.

Without seeing your code, it's difficult to troubleshoot. Please package and upload it; I'll help you fix the issue.


The full code and seed image are above, but here they are zipped.
https://softology.pro/bug.zip
The problem is that seed_test_v25.py gives the messier results when using the init seed image.

I couldn't find your 'seed' parameter; it seems that you haven't provided the seed argument.


Passing in a seed parameter is not needed and does not help here (if I set a seed the results are still the noisy red results). Did that fix it for you?

It works on the same principle as text-to-image; you just need to pass in a SEED parameter, and by fixing the SEED the img2img results will be the same every time.


I understand what a seed is and what it is for, but setting one here does not fix the problem. i.e. I change the code to

image = pipe(
prompt="a portrait of a zombie",
negative_prompt="",
guidance_scale=3.0,
width=1024,
height=1024,
safety_checker=False,
image=init_image,
strength=0.3,
seed=12345,
num_inference_steps=50
).images[0]

and get the same red image results.
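
For completeness, diffusers pipelines take the seed through a generator argument rather than a seed keyword, so the seed=12345 above is not a recognized pipeline argument. A minimal sketch of seeding the call (reusing pipe and init_image from the scripts above):

import torch

# In diffusers, reproducibility is controlled by passing a seeded torch.Generator;
# a bare `seed=` keyword is not a recognized pipeline argument.
generator = torch.Generator(device="cuda").manual_seed(12345)

image = pipe(
    prompt="a portrait of a zombie",
    negative_prompt="",
    guidance_scale=3.0,
    image=init_image,
    strength=0.3,
    num_inference_steps=50,
    generator=generator,
).images[0]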

The SEED is randomized; its purpose is to reproduce an image. If you don't need it, you can just ignore it.


Yes, the seed is not needed to replicate the problem.
