Glaze and the Effectiveness of Anti-AI Methods for Diffusion Models

Community Article · Published May 15, 2024

This article covers two anti-AI training methods, Glaze and Nightshade, and why they fail to work as intended. It also covers methods to bypass these 'protections', although the result might not be what you want.

What do they do?

Glaze (and, to a lesser extent, Nightshade) tries to manipulate the image in such a way that the diffusion model / text encoder confuses the image's concept (cat) for another concept (canoe). According to the developers of Glaze, it achieves a 92% disruption rate against a model trained on glazed images. Nightshade attempts something similar but focuses more on the text encoder / labeling side: it tries to map one concept to another by training a small "cloak" on top of the image that distorts the tagger's output (the authors used BLIP as their tagger of choice).
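
To give an intuition for how these "cloaks" work in general, here is a minimal sketch of the underlying idea: optimize a small, bounded perturbation so that an encoder's embedding of the image drifts toward a decoy concept. This is not the actual Glaze or Nightshade code; the ToyEncoder below is a made-up stand-in for whatever real image encoder/tagger an attack would target (e.g. CLIP or BLIP), and the budget and step values are illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyEncoder(nn.Module):  # stand-in for a real image encoder (assumption)
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, 3, stride=2, padding=1)
        self.head = nn.Linear(8, 32)
    def forward(self, x):
        h = F.relu(self.conv(x)).mean(dim=(2, 3))  # global average pool
        return F.normalize(self.head(h), dim=-1)   # unit-norm embedding

encoder = ToyEncoder().eval()
source = torch.rand(1, 3, 64, 64)          # the artwork to "cloak"
target = torch.rand(1, 3, 64, 64)          # an image of the decoy concept
with torch.no_grad():
    target_emb = encoder(target)

delta = torch.zeros_like(source, requires_grad=True)
eps, step = 8 / 255, 1 / 255               # perturbation budget and step size
for _ in range(100):
    emb = encoder((source + delta).clamp(0, 1))
    loss = 1 - F.cosine_similarity(emb, target_emb).mean()  # pull toward decoy
    loss.backward()
    with torch.no_grad():
        delta -= step * delta.grad.sign()  # signed gradient step (PGD-style)
        delta.clamp_(-eps, eps)            # keep the change visually small
        delta.grad.zero_()

cloaked = (source + delta).detach().clamp(0, 1)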

The one flaw is that neither method holds up against the training setups people actually use nowadays. For Glaze, the supposed effects of the distortion do not show up in LoRA/finetune tests. For Nightshade, the images don't affect the LoRA in any negative way. (Sidenote: the Glaze team keeps saying that these tests fail because they use LoRAs and not full-fledged finetunes. As far as I know, this is a fabrication, since training a LoRA and finetuning a model should have the same effect, with the LoRA simply being more efficient at the cost of less 'learning'; a rough sketch of what a LoRA layer actually does follows below.)
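
Here is a minimal sketch of the standard LoRA formulation (my own illustration, not any specific trainer's code): the frozen pretrained weight W gets a trainable low-rank update B @ A added on top, so after merging the layer has an effective weight W + scale * B @ A, which is the same kind of update a full finetune would make, just restricted to rank r.

import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, r: int = 4, alpha: float = 4.0):
        super().__init__()
        self.base = base
        self.base.requires_grad_(False)  # pretrained weights stay frozen
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scale = alpha / r

    def forward(self, x):
        # base output + low-rank correction; merging gives W + scale * B @ A
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(nn.Linear(320, 320))
x = torch.randn(2, 320)
print(layer(x).shape)  # torch.Size([2, 320])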

Can they be bypassed?

Yep! The most popular and most-linked example is AdverseCleaner, made by lllyasviel (one of the three authors of ControlNet), which supposedly cleans Glaze off in 10 lines of code:

import numpy as np
import cv2
from cv2.ximgproc import guidedFilter  # requires opencv-contrib-python

img = cv2.imread('input.png').astype(np.float32)
y = img.copy()

# repeatedly smooth out high-frequency adversarial noise while keeping edges
for _ in range(64):
    y = cv2.bilateralFilter(y, 5, 8, 8)

# restore structure, using the original image as the filter's guidance
for _ in range(4):
    y = guidedFilter(img, y, 4, 16)

cv2.imwrite('output.png', y.clip(0, 255).astype(np.uint8))

lllyasviel did say that "the guidance has adversarial noise" and that "[it] may be unsafe to pass it into the filter" before taking the repo down (it has since been re-implemented many times, for example in deglazer). The method removes most of the artifacts, leaving only residual traces, and the result seems to be "fit for consumption".

[Images: difference between input and cleaned output, magnified (25x detail)]

Another common method to remove Glaze is to pass the image through a VAE. A VAE encodes an image into latents (the most popular one, the latent diffusion VAE, turns each 8x8 block of pixels into a set of latent values) and can then decode those latents back into pixels. The downside of VAEs is that they lose detail when encoding, simply because there aren't enough numbers in the latents to represent every possible combination of pixel values. That downside is exactly what a "VAE-loop" exploits: we encode and then immediately decode the image, which produces some noticeable differences, mainly in the glazed areas, making the image "fit for consumption".
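
A rough sketch of such a VAE-loop, assuming the diffusers library and the stabilityai/sd-vae-ft-mse checkpoint (any Stable Diffusion compatible VAE behaves similarly; image dimensions are assumed to be multiples of 8):

import torch
import numpy as np
from PIL import Image
from diffusers import AutoencoderKL

vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse").eval()

img = Image.open("input.png").convert("RGB")
x = torch.from_numpy(np.array(img)).float() / 127.5 - 1.0  # [0, 255] -> [-1, 1]
x = x.permute(2, 0, 1).unsqueeze(0)                        # HWC -> 1CHW

with torch.no_grad():
    latents = vae.encode(x).latent_dist.sample()  # 8x8 pixel blocks -> latents
    y = vae.decode(latents).sample                # latents -> pixels (lossy)

y = ((y.clamp(-1, 1) + 1) * 127.5).squeeze(0).permute(1, 2, 0)
Image.fromarray(y.round().byte().numpy()).save("output.png")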

All of these methods try to remove the Glaze/Nightshade artifacts, which can be done as a precaution when preparing a dataset. There is, however, one thing we forget about when we remove these artifacts.

Noise Offset

Noise offset, as described in crosslabs' article, works by adding a small non-zero offset to the noise that gets mixed into the latent image during training. This effectively increases the maximum possible contrast, letting the model see (and later produce) darker darks and brighter lights. Glaze and Nightshade effectively add noise to the images, acting as a sort of noise offset at train time. This may explain why images generated with LoRAs trained on glazed images can look better than those trained on non-glazed images.
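
For reference, a minimal sketch of noise offset at train time, following the idea from the crosslabs article (the 0.1 value and the tensor shapes are illustrative assumptions): instead of plain Gaussian noise, each sample gets an extra per-channel constant added, which shifts the mean of the noised latents.

import torch

latents = torch.randn(4, 4, 64, 64)  # a batch of latent images (example shape)
noise = torch.randn_like(latents)    # the usual diffusion training noise
noise_offset = 0.1
noise += noise_offset * torch.randn(latents.shape[0], latents.shape[1], 1, 1)

# 'noise' is then used as normal in the diffusion training step, e.g.
# noisy_latents = scheduler.add_noise(latents, noise, timesteps)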

Conclusion

Glaze and Nightshade are the two most popular methods to protect art from AI, although they're not the only ones. The one issue all "AI art protectors" have to face is that AI sees what humans see. A protector can't distort the image so much that it becomes unrecognizable to people, but it also can't rely on barely noticeable artifacts, because those can be removed easily.