title: Ortha
emoji: 🖼
colorFrom: purple
colorTo: red
sdk: gradio
sdk_version: 4.26.0
app_file: app.py
pinned: false
license: apache-2.0

Orthogonal Adaptation

🔧 Dependencies and Installation

  • Python >= 3.9 (Anaconda or Miniconda is recommended)
  • Diffusers==0.19.3
  • xformers (recommended, to save memory)
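
A minimal environment setup might look like the following. The environment name is arbitrary and the exact package set depends on the repository's requirements, so treat this as a sketch rather than the canonical install:

# Create and activate a fresh environment (the name "ortha" is a placeholder)
conda create -n ortha python=3.9 -y
conda activate ortha

# Core dependencies from the list above; xformers is optional but saves memory
pip install diffusers==0.19.3
pip install xformers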

⏬ Pretrained Model and Data Preparation

Pretrained Model Preparation

We adopt the ChilloutMix fine-tuned model for generating human subjects.

git clone https://github.com/TencentARC/Mix-of-Show.git

cd experiments/pretrained_models

# Diffusers-version ChilloutMix
git-lfs clone https://huggingface.co/windwhinny/chilloutmix.git
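
If Git LFS is not yet set up on your machine, initialize it once before running the clone above (this is the standard Git LFS setup command, not specific to this repository):

git lfs install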

🕹️ Single-Client Concept Tuning

Step 0: Data selection and Tagging for a single concept

Data selection and tagging are crucial in single-concept tuning. We strongly recommend checking the data processing in sd-scripts. In our ED-LoRA, we do not require any regularization dataset.

  1. Collect Images: Gather 5-10 images of the concept you want to customize and place them inside a single folder located at /single-concept/data/yourconceptname/image (see the folder sketch after this list). Ensure the images are consistent but also varied in appearance to prevent overfitting.

  2. Create Captions: Write captions for each image you collected. Save these captions as text files in the /single-concept/data/yourconceptname/caption directory.

  3. Generate Masks: To further improve the understanding of the concept, save masks of each image in the /single-concept/data/yourconceptname/mask directory. Use the data_processing.ipynb notebook for this step.

  4. Create Data Configs: In the /single-concept/data_configs directory, create a JSON file that summarizes the files you just created. The file name could be yourconceptname.json.
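
Putting these steps together, the expected folder layout can be created up front. The commands below only mirror the paths mentioned above, with yourconceptname as a placeholder for your own concept:

# Image, caption and mask folders for one concept
mkdir -p single-concept/data/yourconceptname/image
mkdir -p single-concept/data/yourconceptname/caption
mkdir -p single-concept/data/yourconceptname/mask

# Data config summarizing these folders (follow the schema of the existing
# examples in single-concept/data_configs, e.g. hina_amano.json)
touch single-concept/data_configs/yourconceptname.json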

Step 1: Modify the Config at /single-concept/train_configs

Before tuning, it is essential to specify the data paths and adjust certain hyperparameters in the corresponding config file. Below are some basic config settings to be modified:

  • Concept List: The data config you just created should be referenced under 'concept_list'.
  • Validation Prompt: You might want to set a proper validation prompt in single-concept/validation_prompts to visualize the single-concept sampling (an example prompt file is sketched after the config excerpt below).

An excerpt of an example training config:

manual_seed: 1234 # this seed determines the choice of columns from the orthogonal basis (set differently for each concept)

datasets:
  train:
    # Concept data config
    concept_list: single-concept/data_configs/hina_amano.json
    replace_mapping:
      <TOK>: <hina1> <hina2> # concept new token
  val_vis:
    # Validation prompt for visualization during tuning
    prompts: single-concept/validation_prompts/characters/test_girl.txt
    replace_mapping:
      <TOK>: <hina1> <hina2> # Concept new token

models:
  enable_edlora: true  # true means ED-LoRA, false means vanilla LoRA
  new_concept_token: <hina1>+<hina2> # Concept new token, use "+" to connect
  initializer_token: <rand-0.013>+girl
  # Init token, only need to revise the later one based on the semantic category of given concept

val:
  val_during_save: true # When saving checkpoint, visualize sample results.
  compose_visualize: true # Compose all samples into a large grid figure for visualization
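
For reference, a validation prompt file is a plain-text list of prompts, one per line. Because the config maps <TOK> to the concept tokens, prompts can use the <TOK> placeholder. The file name and prompts below are made-up examples, not the contents of the actual test_girl.txt:

# Write a small validation prompt file (hypothetical name and prompts)
cat > single-concept/validation_prompts/characters/my_test_prompts.txt <<'EOF'
a photo of <TOK>, upper body, simple background
<TOK> standing in a garden, best quality
EOF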

Step 2: Start Tuning

We tune each concept with 2 A100 GPUs. As with standard LoRA, community users can enable gradient accumulation, xformers, and gradient checkpointing to tune on a single GPU (see the single-GPU launch sketch below).

accelerate launch train_edlora.py -opt single-concept/0005_lebron_ortho.yml
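
For a single-GPU run, a launch along the following lines should work. --num_processes and --mixed_precision are standard accelerate flags; gradient accumulation, xformers, and gradient checkpointing are typically enabled in the training config itself:

# Single-GPU variant of the command above (generic accelerate options)
accelerate launch --num_processes=1 --mixed_precision=fp16 \
    train_edlora.py -opt single-concept/0005_lebron_ortho.yml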

The LoRA weights for the single concept are saved in the /experiments/single-concept directory, under a folder named after your concept.

Step 3: Single-concept Sampling

To sample an image from your trained weights from the last step, specify the model path in the sample config (located in /single-concept/sample_configs) and run the following command:

python test_edlora.py -opt single-concept/sample_configs/8101_EDLoRA_potter_Cmix_B4_Repeat500.yml

🕹️ Merging LoRAs

Step 1: Collect Concept Models

Collect all the concept models with which you want to extend the pretrained model, and modify the config in /multi-concept/merge_configs accordingly (a quick way to list the available checkpoints is shown after the example config below).

[
    {
        "lora_path": "experiments/0022_elsa_ortho/models/edlora_model-latest.pth",
        "unet_alpha": 1.8,
        "text_encoder_alpha": 1.8,
        "concept_name": "<elsa1> <elsa2>"
    },
    {
        "lora_path": "experiments/0023_moana_ortho/models/edlora_model-latest.pth",
        "unet_alpha": 1.8,
        "text_encoder_alpha": 1.8,
        "concept_name": "<moana1> <moana2>"
    }
    ... # keep adding new concepts for extending the pretrained models
]
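
The lora_path entries simply point at the checkpoints produced during single-concept tuning. A quick way to see which checkpoints are available on disk before filling in this config (the glob just mirrors the paths shown above) is:

# List trained ED-LoRA checkpoints to reference in the merge config
ls experiments/*/models/edlora_model-*.pth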

Step 2: Weight Fusion

Specify which merge config you are using inside the fuse.sh file, and then run:

bash fuse.sh

The merged weights are now saved in the /experiments/multi-concept directory. This process is almost instant.

Step 3: Sample

Regionally controllable multi-concept sampling:

We utilize regionally controllable sampling from Mix-of-Show to enable multi-concept generation. Adding openpose conditioning greatly increases the reliability of generations.

Specify which fused model in /experiments/multi-concept you are going to use, and provide the keypose condition in /multi-concept/pose_data if needed. Also, modify the context prompt and regional prompts. Then run:

bash regionally_sample.sh

The samples from the multi-concept generation will now be stored in the /results folder.