metadata

license: cc-by-4.0
language:
  - en
tags:
  - art
  - text-to-image
  - stable-diffusion-diffusers
  - unlearned-diffusion-model
  - safe-diffusion-model
  - unlearned-text-encoder
  - defensive-unlearning

Defensive Unlearning with Adversarial Training for Robust Concept Erasure in Diffusion Models

Project Website | arXiv Preprint | Fine-tuned Weights | Demo

AdvUnlearn, our proposed unlearning framework, improves the safety of diffusion models by robustly erasing unwanted concepts through adversarial training, while balancing concept erasure against image generation quality.

Baselines

Diffusion Model (DM) Unlearning Method	Nudity	Van Gogh	Objects
ESD (Erased Stable Diffusion)	✅	✅	✅
FMN (Forget-Me-Not)	✅	✅	✅
AC (Ablating Concepts)	❌	✅	❌
UCE (Unified Concept Editing)	✅	✅	❌
SalUn (Saliency Unlearning)	✅	❌	✅
SH (ScissorHands)	✅	❌	✅
ED (EraseDiff)	✅	❌	✅
SPM (concept-SemiPermeable Membrane)	✅	✅	✅
AdvUnlearn (Ours)	✅	✅	✅

Cite Our Work

The preprint can be cited as follows:

@misc{zhang2024defensive,
      title={Defensive Unlearning with Adversarial Training for Robust Concept Erasure in Diffusion Models}, 
      author={Yimeng Zhang and Xin Chen and Jinghan Jia and Yihua Zhang and Chongyu Fan and Jiancheng Liu and Mingyi Hong and Ke Ding and Sijia Liu},
      year={2024},
      eprint={2405.15234},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}