StableDiffusionXLPipeline's watermark introduces undesirable pixel noises in generated images

#31
by newtonapple - opened

The Bug

This was copied from Diffusers issue #4014, as a dev there asked me to message the Stability AI folks. So, I'm posting it here for more visibility.

The introduction of the watermark in the StableDiffusionXLPipeline has introduced undesirable pixel artifacts in the generated images. Here is an example with 2 images: one generated with the watermark (1st image) and one without (2nd image). If you zoom in and look at them side by side, you can clearly see the pixel artifacts in the image generated with the watermark.

burger-and-fries
burger-and-fries-no-wm

Reproduction

Here is a small Python script that I created to mock out the pipeline.watermark object in order to compare the images generated with and without the watermark. I'm using an Apple M2 MacBook, hence device = "mps".

import torch
from diffusers import DiffusionPipeline

device = "mps"


def pipeline():
    model = "stabilityai/stable-diffusion-xl-base-0.9"
    torch_dtype = torch.float32
    variant = "fp32"

    return DiffusionPipeline.from_pretrained(
        model,
        torch_dtype=torch_dtype,
        use_safetensors=True,
        variant=variant,
    ).to(device)


class NoWatermark:
    def apply_watermark(self, img):
        return img


pipe = pipeline()
pipe_no_wm = pipeline()
pipe_no_wm.watermark = NoWatermark()
gen1 = torch.Generator(device="mps").manual_seed(1337)
gen2 = torch.Generator(device="mps").manual_seed(1337)

prompt = "a juicy burger with fries"
img = pipe(prompt=prompt, generator=gen1).images[0]
img_no_wm = pipe_no_wm(prompt=prompt, generator=gen2).images[0]
img.save("burger-and-fries.png")
img_no_wm.save("burger-and-fries-no-wm.png")
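For anyone who wants a number rather than a zoomed-in visual comparison, here is a minimal sketch of how the two saved images could be diffed. The helper name diff_stats is hypothetical (it is not part of diffusers), and the demo uses synthetic arrays as stand-ins for the two PNGs.

```python
import numpy as np


def diff_stats(img_a, img_b):
    """Summarize per-pixel differences between two same-sized images."""
    # int16 avoids uint8 wrap-around when subtracting.
    a = np.asarray(img_a, dtype=np.int16)
    b = np.asarray(img_b, dtype=np.int16)
    if a.shape != b.shape:
        raise ValueError(f"shape mismatch: {a.shape} vs {b.shape}")
    delta = np.abs(a - b)
    changed = np.any(delta > 0, axis=-1)  # True where any channel differs
    return {
        "max_abs_diff": int(delta.max()),
        "mean_abs_diff": float(delta.mean()),
        "changed_pixel_fraction": float(changed.mean()),
    }


# Synthetic stand-ins for the watermarked and non-watermarked images.
clean = np.full((64, 64, 3), 128, dtype=np.uint8)
marked = clean.copy()
marked[10, 20, 0] = 138  # simulate a single altered red value
print(diff_stats(marked, clean))
```

With the real files, the arrays would come from something like np.asarray(PIL.Image.open("burger-and-fries.png").convert("RGB")), and the same for the no-watermark image.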

System Info

  • diffusers version: 0.18.1
  • Platform: macOS-13.4.1-arm64-arm-64bit
  • Python version: 3.10.12
  • PyTorch version (GPU?): 2.0.1 (False)
  • Huggingface_hub version: 0.16.4
  • Transformers version: 4.30.2
  • Accelerate version: 0.20.3
  • xFormers version: not installed
  • Using GPU in script?: yes, mps.
  • Using distributed or parallel set-up in script?: no.

kind of what I was afraid of with all the invisible watermarks. How do you even get rid of the watermark in modern engines that seem to have it on by default?

yep noticed red dots

I still don't understand the logic of this, even legally. According to the current draft rules from the U.S. Copyright Office, any output of AI is public domain; if it was manipulated by a human, it becomes the full copyright of the manipulator. So why a watermark should still be present after copyright applies is a question, because the new, real author gains the right to remove watermarks.
If this is used in the production of a computer game (which is already an imaginative product) and manipulated by studio artists, it becomes the full copyright of the studio, and keeping the watermark is unacceptable; it can even produce visual artifacts in the game.
There is only one long-standing example of forced watermarking of consumer output: almost all laser printers use secret steganography, yellow dots, to mark every printed page. This dates from the 1980s and was introduced for a single purpose: not to prevent, but to find people who print banknotes on laser printers. That is kind of logical, but for art and imaginative products I don't see the logic, and I don't remember any legal case of damage caused by art. Even for the creation of defamatory material, take the famous Michael Jackson case: he was found fully not guilty after an expert examination of certain body characteristics that the witness was unable to describe. Assessing imaginative art is like expert analysis of UFO photographs.
The U.S. Supreme Court made social networks exempt from any liability, so users are responsible. The only example I'm aware of is the "Pentagon fire" image, which was fabricated by a media company and re-posted by the accounts of big media outlets, and which the regulator ignored. It is the media's duty to check the sources of anything (watermarked or not), so it only proves that ordinary people's creations will never become popular without media companies.
In the case of caricatures, the media (newspapers) play the biggest role. There was no legal outcome, and watermarking does nothing here, because the media's main point, which makes those scandals bigger and louder, is that they are "protecting freedom of speech and the press" by doing so.
In conclusion, the only reasons for watermarking are to track users and/or to produce slightly damaged art (ruling out any question of whether it is real).

I think it's worth pointing out that the watermark is only added when you use diffusers's StableDiffusionXLPipeline to generate the images.

it (the watermark) is also in SGM. the XL pipeline doesn't add the watermark if you use latent output. it seems like a bug that the img2img pipeline does not apply the watermark. i will open an issue with HF to ensure that happens.

image.png

@ptx0, as I pointed out in my initial bug report, the watermark library was specifically added to the XL pipeline. I'm not too familiar with all the ins and outs of the code base. This is just from my findings poking around.

i see what you mean, i thought you meant that in a different way. you are correct, the old pipelines have a watermark but SDXL uses a different one - not sure why.

Thanks a lot for the helpful discussion here! We could / should maybe try replacing the watermark with the tree-ring watermark technique: https://github.com/YuxinWenRick/tree-ring-watermark

I think it's worth pointing out that the watermark is only added when you use diffusers's StableDiffusionXLPipeline to generate the images.

It's also in the img2img pipeline - think it's everywhere really

PyTorch == 1.13.0
transformers == 4.23.1
diffusers == 0.11.1
Note: higher diffusers version may not be compatible with the DDIM inversion code.

That library might need a bit of love, first.

I'm just wondering why there is now this need for a watermark. In my case, I'm working on my master's thesis, which is about data augmentation using synthetic images to improve classification models; in particular, I was using the previous SD versions together with the DreamBooth script. For XL, however, running the script failed because the invisible-watermark library was missing, so I had to install it. I'm worried that these watermarks change the data distribution relative to the real data, through new patterns that I may not see but that are present in these (not so) invisible watermarks. For anyone looking for a way to bypass them, luckily there's a way found here that should work.

@patrickvonplaten when you say "It's also in the img2img pipeline - think it's everywhere really", do you mean everywhere in all XL-related stuff? I'm assuming previous SD versions don't use any watermarking technique behind the scenes :D

Thanks!

it's required by the license, mate

@ptx0 yes, of course, but my question went deeper than that: we have already used SD models before without issues, and now they come up with watermarking. I don't know if it's related to people's concerns about AI and its safe use, and they are just taking steps ahead on that matter.

have you tried other uis like comfy ui?

i just made a tutorial for that but not sure if that is also suffering from this or not

ComfyUI Master Tutorial - Stable Diffusion XL (SDXL) - Install On PC, Google Colab (Free) & RunPod

please man, stop STOP spamming the forums here with your youtube channel, No one is interested in it, not relevant to the topic at all

i dont know if you are reading but i gave an idea for the person to try

how it is not relevant have you tested comfyui ?

Nobody can know if it's relevant without watching a probably incredibly annoying video. It would be relevant if YOU had actually tested it for watermarking and then written a simple bit of text saying something like, "oh hai, the watermark isn't there in comfyUI. Here's a link to the github, but if you're having trouble following the 4 lines of installation instructions there I have a video tutorial like 40 other people on youtube that you can check out." instead of spamming your youtube channel via a link to a video on setting up a UI that only took me 10 minutes to get up and running on the AMD platform with their github instructions via directml, which might as well be unsupported for the amount of stuff that just crashes on it, and I won't install git on windows until there's a native version, so I had to manually download a repo or two on top of that.

"But my video shows how to set up SDXL in ComfyUI too!",

ComfyUI saves the entire node tree and settings used to create an image in the .png it generates by default, so the polite thing to do would be to link to the github, then post a PNG generated with SDXL in it that anyone can open in the UI and get the entire node layout instantly without watching your video. Then maybe explain that since pipelines aren't used in Comfy directly (rather they're chained together and executed from the tree step-wise) there's some chance the watermarking won't execute. Maybe even do enough testing to find out what conditions will make that happen. I won't even click on the video, because youtube removed downvotes and me clicking is just going to feed idiots like you ad revenue (and I already know it's not going to be worth downloading with yt-dlp from the cover image).

"But," you shout, "I'm a CONTENT CREATOR!!!11 NOT A SPAMMER."

I create content too, usually several times a day. Unlike youtubers I have the courtesy to flush afterwards.

Thanks a lot for the helpful discussion here! We could / should maybe try replacing the watermark with the tree-ring watermark technique: https://github.com/YuxinWenRick/tree-ring-watermark

@patrickvonplaten What are your current thoughts on this? In my own experiments, I found multiple issues with the current watermark technique:

  • JPEG compression can reduce decoding performance considerably (for example, from an 85% to a 56% bit match).
  • Even with no postprocessing at all, decoding sometimes fails. Below is my simple pipeline:
pipeline_text2image = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16, variant="fp16", use_safetensors=True
).to("cuda")

prompt = "photo of dog"
for i in range(10):
    image = pipeline_text2image(prompt=prompt).images[0]
    watermark_dec = water_maker_bits.decode(np.array(image))

Of the 10 images I generated, some match on only 27 of 48 bits.
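To make the two observations above easier to reproduce independently, here is a rough sketch of the pieces involved: a fixed-width bit conversion, a bit-match count, and an in-memory JPEG round trip. The helper names int_to_bits, bit_match and jpeg_roundtrip are hypothetical, and the quality value is an arbitrary assumption. The sketch only needs numpy and Pillow; the actual watermark decoder is not called here.

```python
import io

import numpy as np
from PIL import Image


def int_to_bits(value, num_bits):
    """Fixed-width, most-significant-bit-first bit array (keeps leading zeros)."""
    return np.array(
        [(value >> i) & 1 for i in range(num_bits - 1, -1, -1)], dtype=np.uint8
    )


def bit_match(expected, decoded):
    """Return (matching_bits, total_bits) for two equal-length bit sequences."""
    expected = np.asarray(expected, dtype=np.uint8)
    decoded = np.asarray(decoded, dtype=np.uint8)
    if expected.shape != decoded.shape:
        raise ValueError("bit sequences differ in length")
    return int(np.sum(expected == decoded)), int(expected.size)


def jpeg_roundtrip(rgb_array, quality=75):
    """Encode an RGB uint8 array as JPEG in memory and decode it again."""
    buf = io.BytesIO()
    Image.fromarray(rgb_array).save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return np.array(Image.open(buf).convert("RGB"))


expected = int_to_bits(0b0011, 4)          # [0, 0, 1, 1]
print(bit_match(expected, [0, 1, 1, 1]))   # (3, 4)
```

In a real test, the decoded bits would come from running the watermark decoder on np.array(image) and again on jpeg_roundtrip(np.array(image)), then comparing the two bit_match results.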

Thanks

P.S.: Across all the public Hugging Face Spaces, I found no image that carries the watermark; a Reddit thread reaches the same conclusion: https://www.reddit.com/r/StableDiffusion/comments/16duvd6/why_how_to_check_invisible_watermark/

Can anyone confirm this? Or am I missing something? Thanks

Continuing my comment above, I found that watermark decoding behaves differently between Diffusers (works) and the Hosted Inference API (doesn't work). Maybe that's the reason none of the public Spaces have the watermark.

Below is the code to reproduce the results:

Using Diffusers

import io

import numpy as np
import torch
from diffusers import AutoPipelineForText2Image
from imwatermark import WatermarkDecoder
from PIL import Image


class WaterMaker:
    def __init__(self, watermark):
        self.watermark = watermark
        self.data_format = 'bits'
        # bin() drops leading zeros, so this assumes the watermark's top bit is 1.
        self.watermark_bits = np.array([int(bit) for bit in bin(self.watermark)[2:]])
        num_bits = len(self.watermark_bits)
        self.decoder = WatermarkDecoder("bits", num_bits)

    def decode(self, img):
        "The Image must be in RGB format"
        watermark = self.decoder.decode(img, 'dwtDct')
        # Element-wise comparison of expected bits against decoded bits.
        match = self.watermark_bits == watermark
        print('Number of bits match')
        print(match)
        print(np.sum(match))
        # Fraction of decoded bits that equal the expected watermark bits.
        return np.sum(match) / len(match)

water_maker_bits = WaterMaker(0b101100111110110010010000011110111011000110011110)

pipeline_text2image = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16, variant="fp16", use_safetensors=True
).to("cuda")

prompt = "photo of dog"
for i in range(5):
    image = pipeline_text2image(prompt=prompt).images[0]
    watermark_dec = water_maker_bits.decode(np.array(image))

Almost all of the generated images have a 48/48 bit match.

Using Hosted Inference API

import requests

API_URL = "https://api-inference.huggingface.co/models/stabilityai/stable-diffusion-xl-base-1.0"
# `token` must hold a valid Hugging Face API token before this runs.
headers = {"Authorization": f"Bearer {token}"}

def query(payload):
    response = requests.post(API_URL, headers=headers, json=payload)
    # Fail early on an error response instead of passing non-image bytes to PIL.
    response.raise_for_status()
    return response.content

for i in range(5):
    image_bytes = query({"inputs": f"photo of dog {i}"})
    image = Image.open(io.BytesIO(image_bytes))
    watermark_dec = water_maker_bits.decode(np.array(image))

Almost all of the images match on only 21 of 48 bits.
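For context on these numbers: if a decoder reads bits from an image that carries no recoverable watermark, and we assume each decoded bit is an independent coin flip (an assumption, not something measured in this thread), about 24 of 48 bits would match by chance. A small sketch of that calculation, using a hypothetical helper prob_at_least:

```python
from math import comb


def prob_at_least(k, n=48, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p): chance that k or more of n random bits match."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


for k in (21, 27, 48):
    print(f"P(at least {k}/48 matching by chance) = {prob_at_least(k):.3g}")
```

Under that assumption, 21/48 and 27/48 are both in the range that random bits would produce, while a 48/48 match is essentially impossible by chance. One thing that may be worth checking (not verified here) is whether the API returns a lossy format; image.format on the opened PIL image reports it.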
