---
license: creativeml-openrail-m
base_model: stabilityai/stable-diffusion-xl-base-1.0
dataset: NYUAD-ComNets/Asian_Female_Profession
tags:
- stable-diffusion-xl
- stable-diffusion-xl-diffusers
- text-to-image
- diffusers
- lora
inference: true
---
    


# Model description

This model is part of a project aimed at debiasing generative Stable Diffusion models.

LoRA text2image fine-tuning - NYUAD-ComNets/Asian_Female_Profession_Model

These are LoRA adaptation weights for stabilityai/stable-diffusion-xl-base-1.0. The weights were fine-tuned on the NYUAD-ComNets/Asian_Female_Profession dataset.
Example images are shown in the Examples section below.

prompt: a photo of a {profession}, looking at the camera, closeup headshot facing forward, ultra quality, sharp focus

# How to use this model:

``` python
import random

import torch
from compel import Compel, ReturnedEmbeddingsType
from diffusers import DiffusionPipeline

negative_prompt = "cartoon, anime, 3d, painting, b&w, low quality"

# One LoRA repository per demographic group, paired by position with `adapters` below.
models = [
    "NYUAD-ComNets/Asian_Female_Profession_Model",
    "NYUAD-ComNets/Black_Female_Profession_Model",
    "NYUAD-ComNets/White_Female_Profession_Model",
    "NYUAD-ComNets/Indian_Female_Profession_Model",
    "NYUAD-ComNets/Latino_Hispanic_Female_Profession_Model",
    "NYUAD-ComNets/Middle_Eastern_Female_Profession_Model",
    "NYUAD-ComNets/Asian_Male_Profession_Model",
    "NYUAD-ComNets/Black_Male_Profession_Model",
    "NYUAD-ComNets/White_Male_Profession_Model",
    "NYUAD-ComNets/Indian_Male_Profession_Model",
    "NYUAD-ComNets/Latino_Hispanic_Male_Profession_Model",
    "NYUAD-ComNets/Middle_Eastern_Male_Profession_Model",
]

adapters = [
    "asian_female", "black_female", "white_female",
    "indian_female", "latino_female", "middle_east_female",
    "asian_male", "black_male", "white_male",
    "indian_male", "latino_male", "middle_east_male",
]

# Load the SDXL base model in half precision on the GPU.
pipeline = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    variant="fp16",
    use_safetensors=True,
    torch_dtype=torch.float16,
).to("cuda")

# Register every LoRA under its own adapter name.
for model, adapter in zip(models, adapters):
    pipeline.load_lora_weights(model, weight_name="pytorch_lora_weights.safetensors", adapter_name=adapter)

# Activate a single adapter, chosen uniformly at random.
pipeline.set_adapters(random.choice(adapters))

# Compel encodes the prompt with both SDXL text encoders; only the second one returns pooled embeddings.
compel = Compel(
    tokenizer=[pipeline.tokenizer, pipeline.tokenizer_2],
    text_encoder=[pipeline.text_encoder, pipeline.text_encoder_2],
    returned_embeddings_type=ReturnedEmbeddingsType.PENULTIMATE_HIDDEN_STATES_NON_NORMALIZED,
    requires_pooled=[False, True],
    truncate_long_prompts=False,
)

conditioning, pooled = compel("a photo of a doctor, looking at the camera, closeup headshot facing forward, ultra quality, sharp focus")
negative_conditioning, negative_pooled = compel(negative_prompt)

# With truncation disabled, the two embeddings can differ in length, so pad them to match.
[conditioning, negative_conditioning] = compel.pad_conditioning_tensors_to_same_length([conditioning, negative_conditioning])

image = pipeline(
    prompt_embeds=conditioning,
    negative_prompt_embeds=negative_conditioning,
    pooled_prompt_embeds=pooled,
    negative_pooled_prompt_embeds=negative_pooled,
    num_inference_steps=40,
).images[0]

# The output path below is the card's placeholder; point it at a writable location.
image.save('/../../x.jpg')

```
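
The snippet above activates one adapter at random for each run. If an even split across the twelve adapters is wanted over a batch of images, one option is to precompute a balanced schedule of adapter names. The following is a minimal sketch under that assumption; `balanced_adapter_schedule` and its `seed` parameter are hypothetical helpers for illustration and are not part of this model or of diffusers.

```python
import random

ADAPTERS = [
    "asian_female", "black_female", "white_female",
    "indian_female", "latino_female", "middle_east_female",
    "asian_male", "black_male", "white_male",
    "indian_male", "latino_male", "middle_east_male",
]


def balanced_adapter_schedule(adapters, n_images, seed=0):
    """Return n_images adapter names in which the counts differ by at most one."""
    rng = random.Random(seed)
    full_rounds, remainder = divmod(n_images, len(adapters))
    # Every adapter appears full_rounds times; the leftover slots go to distinct adapters.
    schedule = list(adapters) * full_rounds + rng.sample(list(adapters), remainder)
    rng.shuffle(schedule)
    return schedule


schedule = balanced_adapter_schedule(ADAPTERS, 24)
print(schedule[:3])
```

Each name in the schedule would then be passed to `pipeline.set_adapters(name)` before generating the corresponding image.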


# Examples

| | | |
|:-------------------------:|:-------------------------:|:-------------------------:|
|<img width="500" alt="example image" src="./0.jpg"> | <img width="500" alt="example image" src="./180.jpg">|<img width="500" alt="example image" src="./9.jpg">|
|<img width="500" alt="example image" src="./11.jpg"> | <img width="500" alt="example image" src="./222.jpg">|<img width="500" alt="example image" src="./71.jpg">|
|<img width="500" alt="example image" src="./154.jpg"> | <img width="500" alt="example image" src="./52.jpg">|<img width="500" alt="example image" src="./655.jpg">|
|<img width="500" alt="example image" src="./169.jpg"> | <img width="500" alt="example image" src="./54.jpg">|<img width="500" alt="example image" src="./6.jpg">|




# Training data

The NYUAD-ComNets/Asian_Female_Profession dataset was used to fine-tune stabilityai/stable-diffusion-xl-base-1.0.

profession list =['pilot','doctor','nurse','pharmacist','dietitian','professor','teacher','mathematics scientist','computer engineer','programmer','tailor','cleaner',
'soldier','security guard','lawyer','manager','accountant','secretary','singer','journalist','youtuber','tiktoker','fashion model','chef','sushi chef']
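
To produce one prompt per profession, the prompt template given in the model description can be filled from this list. The sketch below assumes the template applies unchanged to every profession; `build_prompts` is a hypothetical helper name, not something shipped with the model.

```python
PROMPT_TEMPLATE = (
    "a photo of a {profession}, looking at the camera, "
    "closeup headshot facing forward, ultra quality, sharp focus"
)

PROFESSIONS = [
    "pilot", "doctor", "nurse", "pharmacist", "dietitian", "professor",
    "teacher", "mathematics scientist", "computer engineer", "programmer",
    "tailor", "cleaner", "soldier", "security guard", "lawyer", "manager",
    "accountant", "secretary", "singer", "journalist", "youtuber",
    "tiktoker", "fashion model", "chef", "sushi chef",
]


def build_prompts(professions, template=PROMPT_TEMPLATE):
    """Map each profession to its filled-in prompt string."""
    return {name: template.format(profession=name) for name in professions}


prompts = build_prompts(PROFESSIONS)
print(prompts["doctor"])
```

Each resulting string can be passed to `compel(...)` in place of the hard-coded doctor prompt in the usage example.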

# Configurations

LoRA for the text encoder was not enabled.

Special VAE used for training: madebyollin/sdxl-vae-fp16-fix.



# BibTeX entry and citation info

```

@article{aldahoul2024ai,
  title={AI-generated faces free from racial and gender stereotypes},
  author={AlDahoul, Nouar and Rahwan, Talal and Zaki, Yasir},
  journal={arXiv preprint arXiv:2402.01002},
  year={2024}
}

@misc{ComNets,
      url={https://huggingface.co/NYUAD-ComNets/Asian_Female_Profession_Model},
      title={Asian_Female_Profession_Model},
      author={AlDahoul, Nouar and Rahwan, Talal and Zaki, Yasir}
}
```