---
title: Ortha
emoji: 🖼
colorFrom: purple
colorTo: red
sdk: gradio
sdk_version: 4.26.0
app_file: app.py
pinned: false
license: apache-2.0
---
# Orthogonal Adaptation
## 🔧 Dependencies and Installation
- Python >= 3.9 (we recommend [Anaconda](https://www.anaconda.com/download/#linux) or [Miniconda](https://docs.conda.io/en/latest/miniconda.html))
- diffusers==0.19.3
- xformers (recommended, to save memory; see the setup sketch below)
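A minimal environment setup, assuming conda and a CUDA-capable GPU, might look like the following sketch; the environment name and the extra packages (`transformers`, `accelerate`) are assumptions, not pinned requirements of this repo:
```bash
# Create and activate an isolated environment (name is arbitrary)
conda create -n ortha python=3.9 -y
conda activate ortha

# Pinned diffusers release plus commonly needed companions (assumed, unpinned)
pip install diffusers==0.19.3 transformers accelerate

# Optional: memory-efficient attention
pip install xformers
```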
## ⏬ Pretrained Model and Data Preparation
### Pretrained Model Preparation
We adopt the [ChilloutMix](https://civitai.com/models/6424/chilloutmix) fine-tuned model for generating human subjects.
```bash
git clone https://github.com/TencentARC/Mix-of-Show.git
cd Mix-of-Show/experiments/pretrained_models
# Diffusers-version ChilloutMix
git-lfs clone https://huggingface.co/windwhinny/chilloutmix.git
```
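After the download, the checkpoint should follow the standard diffusers pipeline layout; a quick sanity check (the exact file list may vary) is:
```bash
# A diffusers-format checkpoint normally contains these pipeline subfolders
ls chilloutmix
# typical layout: model_index.json  scheduler/  text_encoder/  tokenizer/  unet/  vae/  ...
```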
<!-- ### Data Preparation
Note: Data selection and tagging are important in single-concept tuning. We strongly recommend checking the data processing in [sd-scripts](https://github.com/kohya-ss/sd-scripts). **In our ED-LoRA, we do not require any regularization dataset.** The detailed dataset preparation steps can refer to [Dataset.md](docs/Dataset.md). Our preprocessed data used in this repo is available at [Google Drive](https://drive.google.com/file/d/1O5oev8861N_KmKtqefb45l3SiSblbo5O/view?usp=sharing). -->
## 🕹️ Single-Client Concept Tuning
### Step 0: Data selection and Tagging for a single concept
Data selection and tagging are crucial in single-concept tuning. We strongly recommend checking the data processing in [sd-scripts](https://github.com/kohya-ss/sd-scripts). **In our ED-LoRA, we do not require any regularization dataset.**
1. **Collect Images**: Gather 5-10 images of the concept you want to customize and place them in a single folder at `/single-concept/data/yourconceptname/image` (see the directory sketch after this list). Ensure the images are consistent in subject but varied in appearance to prevent overfitting.
2. **Create Captions**: Write captions for each image you collected. Save these captions as text files in the `/single-concept/data/yourconceptname/caption` directory.
3. **Generate Masks**: To further improve the understanding of the concept, save masks of each image in the `/single-concept/data/yourconceptname/mask` directory. Use the `data_processing.ipynb` notebook for this step.
4. **Create Data Configs**: In the `/single-concept/data_configs` directory, create a JSON file that summarizes the files you just created. The file name could be `yourconceptname.json`.
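Taken together, the steps above produce a layout like the sketch below, with `yourconceptname` as a placeholder:
```bash
# Create the expected folder layout for a new concept (placeholder name)
mkdir -p single-concept/data/yourconceptname/{image,caption,mask}

# After Steps 1-4 the tree should look roughly like:
# single-concept/
# ├── data/
# │   └── yourconceptname/
# │       ├── image/    # 5-10 images of the concept
# │       ├── caption/  # one .txt caption per image
# │       └── mask/     # one mask per image (from data_processing.ipynb)
# └── data_configs/
#     └── yourconceptname.json   # summarizes the paths above
```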
### Step 1: Modify the Config at `/single-concept/train_configs`
Before tuning, it is essential to specify the data paths and adjust certain hyperparameters in the corresponding config file. Below are some basic config settings to be modified:
- **Concept List**: Reference the data config you just created under `concept_list`.
- **Validation Prompt**: Set a validation prompt in `single-concept/validation_prompts` to visualize single-concept sampling during tuning.
```yaml
manual_seed: 1234 # this seed determines the choice of columns from the orthogonal basis (set it differently for each concept)

datasets:
  train:
    # Concept data config
    concept_list: single-concept/data_configs/hina_amano.json
    replace_mapping:
      <TOK>: <hina1> <hina2> # concept new tokens
  val_vis:
    # Validation prompts for visualization during tuning
    prompts: single-concept/validation_prompts/characters/test_girl.txt
    replace_mapping:
      <TOK>: <hina1> <hina2> # concept new tokens

models:
  enable_edlora: true # true means ED-LoRA, false means vanilla LoRA
  new_concept_token: <hina1>+<hina2> # concept new tokens, joined with "+"
  initializer_token: <rand-0.013>+girl
  # Initializer tokens; only the latter needs to be revised, based on the semantic category of the given concept

val:
  val_during_save: true # when saving a checkpoint, visualize sample results
  compose_visualize: true # compose all samples into one large grid figure for visualization
```
### Step 2: Start Tuning
We tune each concept on 2 A100 GPUs. As with standard LoRA, community users can enable gradient accumulation, xformers, and gradient checkpointing to tune on a single GPU.
```bash
accelerate launch train_edlora.py -opt single-concept/0005_lebron_ortho.yml
```
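For a single-GPU run, the launch might look like the sketch below; `--num_processes` and `--mixed_precision` are standard accelerate flags, while gradient accumulation, gradient checkpointing, and xformers are assumed to be toggled inside the training config rather than on the command line:
```bash
# Hypothetical single-GPU launch; memory savers (gradient accumulation,
# gradient checkpointing, xformers) are assumed to be set in the train config.
accelerate launch --num_processes=1 --mixed_precision=fp16 \
    train_edlora.py -opt single-concept/0005_lebron_ortho.yml
```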
The LoRA weights for the single concept are saved inside the `/experiments/single-concept` folder, under a subfolder named after your concept.
### Step 3: Single-concept Sampling
To sample images from the weights trained in the last step, specify the model path in the sample config (located in `/single-concept/sample_configs`) and run the following command:
```bash
python test_edlora.py -opt single-concept/sample_configs/8101_EDLoRA_potter_Cmix_B4_Repeat500.yml
```
## 🕹️ Merging LoRAs
### Step 1: Collect Concept Models
Collect all the concept models you want to merge into the pretrained model, and modify the config in `/multi-concept/merge_configs` accordingly.
```json
[
    {
        "lora_path": "experiments/0022_elsa_ortho/models/edlora_model-latest.pth",
        "unet_alpha": 1.8,
        "text_encoder_alpha": 1.8,
        "concept_name": "<elsa1> <elsa2>"
    },
    {
        "lora_path": "experiments/0023_moana_ortho/models/edlora_model-latest.pth",
        "unet_alpha": 1.8,
        "text_encoder_alpha": 1.8,
        "concept_name": "<moana1> <moana2>"
    }
    ... # keep adding new concepts for extending the pretrained model
]
```
### Step 2: Weight Fusion
Specify which merge config you are using inside the `fuse.sh` file, and then run:
```bash
bash fuse.sh
```
The merged weights are now saved in the `/experiments/multi-concept` directory. This process is almost instant.
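For orientation, `fuse.sh` is expected to wrap a single fusion call roughly like the sketch below; the entry-point name and every flag shown are hypothetical placeholders and must be matched to what the repository's actual `fuse.sh` contains:
```bash
#!/usr/bin/env bash
# Hypothetical sketch only: the script name and flag names below are placeholders,
# not the repository's real interface.
python weight_fusion.py \
    --concept_cfg multi-concept/merge_configs/your_merge_config.json \
    --save_path experiments/multi-concept/your_fused_model \
    --pretrained_model experiments/pretrained_models/chilloutmix
```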
### Step 3: Sample
**Regionally controllable multi-concept sampling:**
We utilize regionally controllable sampling from Mix-of-Show to enable multi-concept generation. Adding OpenPose keypose conditioning greatly increases the reliability of the generations.
Choose which fused model inside `/experiments/multi-concept` you are going to use, and specify the keypose condition in `/multi-concept/pose_data` if needed. Also modify the context prompt and the regional prompts. Then run:
```bash
bash regionally_sample.sh
```
The samples from the multi-concept generation will now be stored in the `/results` folder.
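As a rough illustration, the values one typically edits inside `regionally_sample.sh` look like the sketch below; every variable name and path here is a hypothetical placeholder, so reuse whatever the script actually defines:
```bash
# Hypothetical placeholders for the pieces usually edited before regionally
# controllable sampling; the real variable/argument names live in regionally_sample.sh.
fused_model="experiments/multi-concept/elsa_moana_fused"        # fused weights from Step 2
keypose_condition="multi-concept/pose_data/two_characters.png"  # optional OpenPose condition
context_prompt="two girls standing in a snowy forest"           # global scene description
region1_prompt="a <elsa1> <elsa2>, standing, smiling"           # prompt bound to region 1
region2_prompt="a <moana1> <moana2>, standing, waving"          # prompt bound to region 2
```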