---
title: W2W Demo
emoji: 🏋️
colorFrom: yellow
colorTo: green
sdk: gradio
sdk_version: 4.37.2
app_file: app.py
pinned: false
---

# Interpreting the Weight Space of Customized Diffusion Models

[paper] [project page]

Official implementation of the paper "Interpreting the Weight Space of Customized Diffusion Models."

*(teaser figure)*

We investigate the space of weights spanned by a large collection of customized diffusion models. We populate this space by creating a dataset of over 60,000 models, each of which is fine-tuned to insert a different person’s visual identity. Next, we model the underlying manifold of these weights as a subspace, which we term weights2weights. We demonstrate three immediate applications of this space -- sampling, editing, and inversion. First, as each point in the space corresponds to an identity, sampling a set of weights from it results in a model encoding a novel identity. Next, we find linear directions in this space corresponding to semantic edits of the identity (e.g., adding a beard). These edits persist in appearance across generated samples. Finally, we show that inverting a single image into this space reconstructs a realistic identity, even if the input image is out of distribution (e.g., a painting). Our results indicate that the weight space of fine-tuned diffusion models behaves as an interpretable latent space of identities.

## Setup

### Environment

Our code is developed in PyTorch 2.3.0 with CUDA 12.1, torchvision 0.18.0, and Python 3.12.3.

To replicate our environment, install Anaconda, and run the following commands.

```bash
$ conda env create -f w2w.yml
$ conda activate w2w
```

Alternatively, you can follow the setup from PEFT.

### Files

The files needed to create w2w space, load models, train classifiers, etc. can be downloaded at this link. Keep the folder structure and place it into the weights2weights folder containing all the code.

The dataset of full model weights (i.e. the full Dreambooth LoRA parameters) will be released within the next week (by June 21).

## Sampling

We provide an interactive notebook for sampling new identity-encoding models from w2w space in sampling/sampling.ipynb. Instructions are provided in the notebook. Once a model is sampled, you can run inference with various text prompts and generation seeds, just as with a typical personalized model.
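
For intuition, the core of the sampling procedure can be sketched in a few lines: draw principal-component coefficients from a distribution fit to the model dataset and project them back to weight space. Everything below (dimensions, variable names, the sampling distribution) is an illustrative assumption; the notebook uses the actual basis and statistics shipped with the downloaded files.

```python
# Hypothetical sketch of sampling a new identity-encoding model from w2w space.
# The mean, basis, and coefficient statistics are random stand-ins for the real ones.
import torch

weight_dim = 100_000       # flattened LoRA parameter count (placeholder)
num_components = 1_000     # dimensionality of the w2w subspace (placeholder)

mean_weights = torch.randn(weight_dim)             # mean of the model dataset
basis = torch.randn(weight_dim, num_components)    # principal components as columns
coeff_std = torch.ones(num_components)             # per-component spread of the dataset

# Sample coefficients around the dataset distribution and project back to weight space.
coeffs = torch.randn(num_components) * coeff_std
new_weights = mean_weights + basis @ coeffs        # weights encoding a novel identity
```

The sampled weights would then be reshaped and loaded into the corresponding LoRA layers of the diffusion model before running inference.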

## Inversion

We provide an interactive notebook for inverting a single image into a model in w2w space in inversion/inversion_real.ipynb. Instructions are provided in the notebook. We provide another notebook with an example of inverting an out-of-distribution identity in inversion/inversion_ood.ipynb. Assets for these notebooks are provided in inversion/images/, and you can place your own assets there.

Additionally, we provide an example script run_inversion.sh for running the inversion in invert.py. You can run the command:

```bash
$ bash inversion/run_inversion.sh
```

Details on the various arguments are provided in invert.py.
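
At a high level, inversion fits a small vector of w2w coefficients to a single image by gradient descent. The sketch below shows only that optimization skeleton; the loss is a placeholder for the diffusion denoising objective that invert.py actually computes, and all names and dimensions are assumptions.

```python
# Minimal sketch of inversion into w2w space: optimize principal-component
# coefficients so that the reconstructed LoRA weights explain one target image.
import torch

weight_dim, num_components = 100_000, 1_000         # placeholder sizes
mean_weights = torch.randn(weight_dim)               # stand-in for the dataset mean
basis = torch.randn(weight_dim, num_components)      # stand-in for the w2w basis

coeffs = torch.zeros(num_components, requires_grad=True)
optimizer = torch.optim.Adam([coeffs], lr=1e-1)

def denoising_loss(weights: torch.Tensor) -> torch.Tensor:
    # Placeholder: the real objective loads `weights` into the model's LoRA
    # layers and measures the noise-prediction error on the input image.
    return weights.pow(2).mean()

for step in range(100):
    optimizer.zero_grad()
    weights = mean_weights + basis @ coeffs          # stay on the w2w manifold
    loss = denoising_loss(weights)
    loss.backward()
    optimizer.step()
```

Because only the low-dimensional coefficients are optimized, the result stays on the manifold of identity-encoding models, which is what lets out-of-distribution inputs (e.g., paintings) invert to realistic identities.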

## Editing

We provide an interactive notebook for editing the identity encoded in a model in editing/identity_editing.ipynb. Instructions are provided in the notebook. Another notebook, editing/multiple_edits.ipynb, shows how to compose multiple attribute edits together.
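
The underlying operation is a linear move in weight space: add a scaled edit direction to the identity-encoding weights. The sketch below only illustrates that arithmetic with random stand-ins; the actual learned directions come with the downloaded files, and the notebooks handle loading and applying them.

```python
# Hypothetical sketch of a semantic edit in w2w space: shift the model weights
# along a learned linear direction (e.g., one corresponding to "add a beard").
import torch

weight_dim = 100_000                               # placeholder size
identity_weights = torch.randn(weight_dim)         # weights of the model to edit
edit_direction = torch.randn(weight_dim)           # stand-in for a learned direction
edit_direction = edit_direction / edit_direction.norm()

strength = 0.8                                     # edit strength; the sign flips the attribute
edited_weights = identity_weights + strength * edit_direction
```

Since the directions are linear, composing several edits amounts to summing their scaled directions.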

## Loading and Saving Models

Various notebooks provide examples of how to save models either as low-dimensional w2w models (represented by principal component coefficients) or as models compatible with standard LoRA, such as for use with Diffusers pipelines. The notebook other/loading.ipynb demonstrates how these weights can be loaded into either format.
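
As a reference for the LoRA-compatible format, loading an exported model into a standard Diffusers pipeline typically looks like the sketch below. The base checkpoint ID, the LoRA path, and the prompt are illustrative placeholders; see other/loading.ipynb for the exact checkpoint and file layout this project expects.

```python
# Sketch of loading exported LoRA weights into a Diffusers pipeline.
# The base checkpoint, the LoRA path, and the prompt are illustrative.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",              # assumed base model
    torch_dtype=torch.float16,
).to("cuda")

pipe.load_lora_weights("path/to/exported_w2w_lora")  # exported LoRA directory or file

image = pipe("a photo of a person smiling", num_inference_steps=30).images[0]
image.save("sample.png")
```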

## Acknowledgments

Our code is based on implementations from the following repos:

## Citation

If you found this repository useful, please consider starring ⭐ and citing:

```bibtex
@misc{dravid2024interpreting,
      title={Interpreting the Weight Space of Customized Diffusion Models},
      author={Amil Dravid and Yossi Gandelsman and Kuan-Chieh Wang and Rameen Abdal and Gordon Wetzstein and Alexei A. Efros and Kfir Aberman},
      year={2024},
      eprint={2406.09413}
}
```