Is there a way to remove the NSFW filter?

#25
by Devilmoon - opened

Given that some of the acceptable goals of using this model include

Safe deployment of models which have the potential to generate harmful content.
Probing and understanding the limitations and biases of generative models.

Is there a way to remove the NSFW filter which blocks any image the model itself deems not safe?
How is it possible to probe the model or understand how to prevent harmful content if the model itself blocks out random images without giving me the option to understand what's going on under the hood?

It's very possible; it's literally the equivalent of removing a couple of lines of code.

EDIT: I'm being deliberately cryptic here, because the authors have clearly chosen to implement a safety feature. The beauty of open source is that anyone can take a monkey wrench to the code and get it to behave in whatever manner they desire. But seeing as this discussion is on the authors' platform, I believe the discussion should respect their desire to create safe content.

You need to modify safety_checker.py in the stable_diffusion pipeline; it is literally one line of code. I don't know how the safety_checker analyses the picture, but it is definitely too strict. I have often had warnings about NSFW content even though my prompts were mostly about geometric shapes, fractals, etc., which are definitely PG-friendly.
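For reference, a minimal sketch of the load-time route (assuming the diffusers StableDiffusionPipeline; the model id and prompt are only examples): passing safety_checker=None skips the check entirely, so it should only be used for the legitimate purposes quoted above.

```python
# Minimal sketch: disable the checker when loading the pipeline (diffusers).
# Depending on the diffusers version, a warning may be printed that the
# safety checker has been disabled.
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",  # example model id
    safety_checker=None,              # no checker -> no black-image replacement
).to("cuda")

image = pipe("geometric fractal pattern, intricate, colourful").images[0]
image.save("fractal.png")
```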

I'm getting "Potential NSFW content was detected in one or more images. A black image will be returned instead. Try again with a different prompt and/or seed." even using the example prompt: "a photograph of an astronaut riding a horse"
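To see which outputs were actually flagged (rather than guessing from the warning), here is a small sketch assuming the standard diffusers pipeline output, which exposes an nsfw_content_detected list alongside the images:

```python
# Sketch: inspect which generated images tripped the checker (checker left enabled).
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4").to("cuda")
result = pipe("a photograph of an astronaut riding a horse", num_images_per_prompt=4)

for i, (img, flagged) in enumerate(zip(result.images, result.nsfw_content_detected)):
    if flagged:
        print(f"image {i} was flagged and replaced with a black image")
    else:
        img.save(f"astronaut_{i}.png")
```

Retrying a false positive with a different seed (a torch.Generator passed as the generator argument) usually works.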

I think it would also be good to have details about the output of the CLIP safety_checker:

[{'special_scores': {0: -0.072, 1: -0.053, 2: -0.085}, 'special_care': [], 'concept_scores': {0: -0.059, 1: -0.042, 2: -0.054, 3: -0.064, 4: -0.05, 5: -0.035, 6: -0.045, 7: -0.057, 8: -0.038, 9: -0.091, 10: -0.033, 11: -0.034, 12: -0.048, 13: -0.075, 14: -0.097, 15: -0.078, 16: -0.098}, 'bad_concepts': []}, {'special_scores': {0: -0.067, 1: -0.046, 2: -0.082}, 'special_care': [], 'concept_scores': {0: -0.057, 1: -0.024, 2: -0.044, 3: -0.048, 4: -0.037, 5: -0.023, 6: -0.026, 7: -0.041, 8: -0.036, 9: -0.073, 10: -0.04, 11: -0.033, 12: -0.032, 13: -0.062, 14: -0.084, 15: -0.06, 16: -0.08}, 'bad_concepts': []}]

and a way to control the threshold from the pipeline, so the NSFW filter can be made stronger or weaker, with a minimum admitted value that still rules out the very bad concepts.
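There is no public threshold argument on the pipeline, but as a rough sketch (assuming the checker keeps its per-concept cut-offs in the concept_embeds_weights / special_care_embeds_weights buffers, as the current diffusers safety_checker.py does; set_nsfw_margin is just a hypothetical helper name) you can shift those cut-offs by a margin:

```python
# Rough sketch of a strength knob for the NSFW filter.
# Positive margin -> more tolerant, negative margin -> stricter.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")

def set_nsfw_margin(pipe, margin: float) -> None:
    """Hypothetical helper: shift every concept threshold by `margin`."""
    checker = pipe.safety_checker
    with torch.no_grad():
        checker.concept_embeds_weights += margin
        checker.special_care_embeds_weights += margin

# Loosen slightly so borderline geometric/fractal prompts stop being flagged;
# concepts that score well above their threshold are still caught.
set_nsfw_margin(pipe, 0.03)
```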

I removed the for loop from the safety_checker.py file that was checking for NSFW content and returning a black image. This worked for me, since the filter was showing the NSFW warning for almost all the prompts I was trying, e.g. "An image of the sun wearing sunglasses." I commented out the code as shown in the screenshot below.

[Screenshot 2022-12-25 at 5.43.14 PM: safety_checker.py with the NSFW check loop commented out]
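A variant that avoids editing the library file at all (a sketch, not an official API: the pipeline calls its checker with images and clip_input keyword arguments, so any callable with that signature can be swapped in at runtime):

```python
# Sketch: replace the checker with a no-op instead of editing safety_checker.py.
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4").to("cuda")

def no_op_safety_checker(images, clip_input, **kwargs):
    # Return the images unchanged and mark every one as "not NSFW".
    return images, [False] * len(images)

pipe.safety_checker = no_op_safety_checker
image = pipe("An image of the sun wearing sunglasses").images[0]
```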

How were you able to modify the repo? I am trying to use the Inference Endpoint but see no way of modifying the files.

Thanks for the help.

In safety_checker.pyc I found this part of the text.
Which text do I have to remove to get rid of the NSFW remark?

[Garbled dump from the compiled safety_checker.pyc: fragments of StableDiffusionSafetyChecker.forward, including the warning string "Potential NSFW content was detected in one or more images. A black image will be returned instead. Try again with a different prompt and/or seed." and references to clip_input, images and pooled_output.]
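Editing the compiled .pyc directly isn't practical; the strings above are just bytecode residue around the warning message. A sketch for locating the actual source file to edit instead (assuming diffusers is installed as a normal package; the module path may differ between versions):

```python
# Locate the editable source rather than the compiled .pyc.
import diffusers.pipelines.stable_diffusion.safety_checker as safety_checker

print(safety_checker.__file__)  # path to safety_checker.py in your diffusers install
```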
