---
pipeline_tag: image-feature-extraction
license: mit
library_name: diffusion-single-file
---
# CleanDIFT Model Card
Diffusion models learn powerful world representations that have proven valuable for tasks like semantic correspondence detection,
depth estimation, semantic segmentation, and classification.
However, diffusion models require noisy input images. Adding this noise destroys information and introduces the noise level as a hyperparameter that must be tuned for each task.
We introduce CleanDIFT, a novel method to extract noise-free, timestep-independent features by enabling diffusion models to work directly with clean input images.
The approach is efficient, training on a single GPU in just 30 minutes. We publish these models alongside our paper ["CleanDIFT: Diffusion Features without Noise"](https://compvis.github.io/cleandift/).
We provide checkpoints for Stable Diffusion 1.5 and Stable Diffusion 2.1.
## Usage
For detailed examples on how to extract features with CleanDIFT and how to use them for downstream tasks, please refer to the notebooks provided [here](https://github.com/CompVis/CleanDIFT/tree/main/notebooks).
Our checkpoints are fully compatible with the `diffusers` library.
If you already have a pipeline using SD 1.5 or SD 2.1 from `diffusers`, you can simply replace the U-Net state dict:
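```python
from diffusers import UNet2DConditionModel
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file

# Load the original SD 2.1 U-Net architecture and weights.
unet = UNet2DConditionModel.from_pretrained("stabilityai/stable-diffusion-2-1", subfolder="unet")
# Download the CleanDIFT checkpoint and read it as a state dict.
ckpt_pth = hf_hub_download(repo_id="CompVis/cleandift", filename="cleandift_sd21_unet.safetensors")
state_dict = load_file(ckpt_pth)
# strict=True raises an error if any key is missing or unexpected.
unet.load_state_dict(state_dict, strict=True)
```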
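With `strict=True`, `load_state_dict` raises an error when the checkpoint's keys do not exactly match the model's. If that happens, it can help to look at which keys differ. The following is a minimal sketch of such a check in plain Python; the helper name `diff_state_dict_keys` and the toy parameter names are hypothetical and not part of CleanDIFT or `diffusers`.

```python
from typing import Iterable, List, Tuple


def diff_state_dict_keys(
    model_keys: Iterable[str], ckpt_keys: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Return (missing, unexpected) key lists, sorted for stable output."""
    model_set, ckpt_set = set(model_keys), set(ckpt_keys)
    # Keys the model expects but the checkpoint does not provide.
    missing = sorted(model_set - ckpt_set)
    # Keys the checkpoint provides but the model does not know.
    unexpected = sorted(ckpt_set - model_set)
    return missing, unexpected


# Toy example with made-up parameter names (not real U-Net keys).
missing, unexpected = diff_state_dict_keys(
    ["conv_in.weight", "conv_in.bias"],
    ["conv_in.weight", "extra.weight"],
)
print("missing:", missing)
print("unexpected:", unexpected)
```

With the objects from the snippet above, the same helper could be called as `diff_state_dict_keys(unet.state_dict().keys(), state_dict.keys())`. Two empty lists would indicate that a strict load should not fail on key names.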
## Citation
```bibtex
@misc{stracke2024cleandiftdiffusionfeaturesnoise,
title={CleanDIFT: Diffusion Features without Noise},
author={Nick Stracke and Stefan Andreas Baumann and Kolja Bauer and Frank Fundel and Björn Ommer},
year={2024},
eprint={2412.03439},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2412.03439},
}
```