wikeeyang committed on
Commit
8f7553f
·
verified ·
1 Parent(s): fcf1717

Upload 3 files

Files changed (4)
  1. .gitattributes +2 -0
  2. README.md +209 -0
  3. bench.png +3 -0
  4. example.jpg +3 -0
.gitattributes CHANGED
@@ -33,3 +33,5 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ bench.png filter=lfs diff=lfs merge=lfs -text
+ example.jpg filter=lfs diff=lfs merge=lfs -text
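The two added `.gitattributes` lines route `bench.png` and `example.jpg` through Git LFS. As a rough illustration of how such patterns select files, here is a minimal Python sketch. It assumes that `fnmatch`-style globbing is a close enough stand-in for Git's own pattern rules, which it is only for simple patterns like the ones in this hunk; the helper names `lfs_patterns` and `is_lfs_tracked` are hypothetical.

```python
from fnmatch import fnmatchcase

# The patterns visible in the hunk above: three context lines plus the two additions.
GITATTRIBUTES = """\
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
bench.png filter=lfs diff=lfs merge=lfs -text
example.jpg filter=lfs diff=lfs merge=lfs -text
"""


def lfs_patterns(text: str) -> list[str]:
    """Return the patterns whose attribute list contains filter=lfs."""
    patterns = []
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines and comments; the first field is the pattern.
        if len(parts) >= 2 and not parts[0].startswith("#") and "filter=lfs" in parts[1:]:
            patterns.append(parts[0])
    return patterns


def is_lfs_tracked(filename: str, text: str = GITATTRIBUTES) -> bool:
    """Approximate check: glob-match a bare file name against each LFS pattern."""
    return any(fnmatchcase(filename, pattern) for pattern in lfs_patterns(text))


if __name__ == "__main__":
    for name in ("bench.png", "example.jpg", "README.md"):
        print(name, is_lfs_tracked(name))
```

For an authoritative answer inside a real checkout, `git check-attr filter -- <path>` reports the attribute Git actually applies.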
README.md ADDED
@@ -0,0 +1,209 @@
+ ---
+ license: apache-2.0
+ language:
+ - en
+ - zh
+ library_name: diffusers
+ pipeline_tag: text-to-image
+ tasks:
+ - text-to-image-synthesis
+ frameworks: PyTorch
+ base_model:
+ - OPPOer/Qwen-Image-Pruning
+ base_model_relation: quantized
+ ---
+ # Qwen-Image-Pruning-for-ComfyUI
+ ===================================================================================
+
+ This model is a converted and fp8_e4m3fn-quantized version of https://huggingface.co/OPPOer/Qwen-Image-Pruning, packaged so that ComfyUI users can load it easily.
+
+ <p align="center">
+     <img src="example.jpg" width="600"/>
+ </p>
+
+ ## License Agreement
+
+ This model falls under the Qwen-Image license, which is Apache-2.0.
+
+
+ The following section is quoted from the original model card:
+
+ ===================================================================================
+
+
+ <div align="center">
+   <h1>Qwen-Image-Pruning</h1>
+ <a href='https://github.com/OPPO-Mente-Lab/Qwen-Image-Pruning'><img src="https://img.shields.io/badge/GitHub-OPPOer-blue.svg?logo=github" alt="GitHub"></a>
+ </div>
+
+ ## Introduction
+ This open-source project prunes Qwen-Image: it removes 20 layers and retains the weights of the remaining 40, which gives a model of 13.3B parameters. The pruned model shows a slight drop in objective metrics, and the pruned version will continue to be iterated on. It also supports adapting and loading community models such as LoRA and ControlNet. For the relevant inference scripts, see https://github.com/OPPO-Mente-Lab/Qwen-Image-Pruning.
+
+ <div align="center">
+   <img src="bench.png">
+ </div>
+
+ ## Quick Start
+
+ Install PyTorch and the latest version of diffusers (from source):
+ ```
+ pip install torch
+ pip install git+https://github.com/huggingface/diffusers
+ ```
+
+ ### 1. Qwen-Image-Pruning Inference
+ ```python
+ import torch
+ import os
+ from diffusers import DiffusionPipeline
+
+ model_name = "OPPOer/Qwen-Image-Pruning"
+
+ if torch.cuda.is_available():
+     torch_dtype = torch.bfloat16
+     device = "cuda"
+ else:
+     torch_dtype = torch.bfloat16
+     device = "cpu"
+
+ pipe = DiffusionPipeline.from_pretrained(model_name, torch_dtype=torch_dtype)
+ pipe = pipe.to(device)
+
+ # Generate image
+ positive_magic = {
+     "en": ", Ultra HD, 4K, cinematic composition.",  # suffix for English prompts
+     "zh": ",超清,4K,电影级构图。",  # suffix for Chinese prompts
+ }
+ negative_prompt = " "
+
+ prompts = [
+ '一个穿着"QWEN"标志的T恤的中国美女正拿着黑色的马克笔面相镜头微笑。她身后的玻璃板上手写体写着 "一、Qwen-Image的技术路线: 探索视觉生成基础模型的极限,开创理解与生成一体化的未来。二、Qwen-Image的模型特色:1、复杂文字渲染。支持中英渲染、自动布局; 2、精准图像编辑。支持文字编辑、物体增减、风格变换。三、Qwen-Image的未来愿景:赋能专业内容创作、助力生成式AI发展。"',
+ '海报,温馨家庭场景,柔和阳光洒在野餐布上,色彩温暖明亮,主色调为浅黄、米白与淡绿,点缀着鲜艳的水果和野花,营造轻松愉快的氛围,画面简洁而富有层次,充满生活气息,传达家庭团聚与自然和谐的主题。文字内容:“共享阳光,共享爱。全家一起野餐,享受美好时光。让每一刻都充满欢笑与温暖。”',
+ '一个穿着校服的年轻女孩站在教室里,在黑板上写字。黑板中央用整洁的白粉笔写着“Introducing Qwen-Image, a foundational image generation model that excels in complex text rendering and precise image editing”。柔和的自然光线透过窗户,投下温柔的阴影。场景以写实的摄影风格呈现,细节精细,景深浅,色调温暖。女孩专注的表情和空气中的粉笔灰增添了动感。背景元素包括课桌和教育海报,略微模糊以突出中心动作。超精细32K分辨率,单反质量,柔和的散景效果,纪录片式的构图。',
+ '一个台球桌上放着两排台球,每排5个,第一行的台球上面分别写着"Qwen""Image" "将 "于" "8" ,第二排台球上面分别写着"月" "正" "式" "发" "布" 。',
+ ]
+
+ output_dir = 'examples_Pruning'
+ os.makedirs(output_dir, exist_ok=True)
+ for prompt in prompts:
+     # Name each output file after the first 80 characters of its prompt
+     output_img_path = f"{output_dir}/{prompt[:80]}.png"
+     image = pipe(
+         prompt=prompt + positive_magic['zh'],
+         negative_prompt=negative_prompt,
+         width=1328,
+         height=1328,
+         num_inference_steps=8,
+         true_cfg_scale=1,
+         # Seed on the selected device so the script also runs without CUDA
+         generator=torch.Generator(device=device).manual_seed(42)
+     ).images[0]
+     image.save(output_img_path)
+ ```
+
+ ### 2. Qwen-Image-Pruning & Realism-LoRA Inference
+ ```python
+ import torch
+ import os
+ from diffusers import DiffusionPipeline
+
+ model_name = "OPPOer/Qwen-Image-Pruning"
+ lora_name = 'flymy_realism.safetensors'
+
+ if torch.cuda.is_available():
+     torch_dtype = torch.bfloat16
+     device = "cuda"
+ else:
+     torch_dtype = torch.bfloat16
+     device = "cpu"
+
+ pipe = DiffusionPipeline.from_pretrained(model_name, torch_dtype=torch_dtype)
+ pipe = pipe.to(device)
+ # Attach the LoRA weights to the pruned base model
+ pipe.load_lora_weights(lora_name, adapter_name="lora")
+
+ # Generate image
+ positive_magic = {
+     "en": ", Ultra HD, 4K, cinematic composition.",  # suffix for English prompts
+     "zh": ",超清,4K,电影级构图。",  # suffix for Chinese prompts
+ }
+ negative_prompt = " "
+
+ prompts = [
+ '一个穿着"QWEN"标志的T恤的中国美女正拿着黑色的马克笔面相镜头微笑。她身后的玻璃板上手写体写着 "一、Qwen-Image的技术路线: 探索视觉生成基础模型的极限,开创理解与生成一体化的未来。二、Qwen-Image的模型特色:1、复杂文字渲染。支持中英渲染、自动布局; 2、精准图像编辑。支持文字编辑、物体增减、风格变换。三、Qwen-Image的未来愿景:赋能专业内容创作、助力生成式AI发展。"',
+ '海报,温馨家庭场景,柔和阳光洒在野餐布上,色彩温暖明亮,主色调为浅黄、米白与淡绿,点缀着鲜艳的水果和野花,营造轻松愉快的氛围,画面简洁而富有层次,充满生活气息,传达家庭团聚与自然和谐的主题。文字内容:“共享阳光,共享爱。全家一起野餐,享受美好时光。让每一刻都充满欢笑与温暖。”',
+ '一个穿着校服的年轻女孩站在教室里,在黑板上写字。黑板中央用整洁的白粉笔写着“Introducing Qwen-Image, a foundational image generation model that excels in complex text rendering and precise image editing”。柔和的自然光线透过窗户,投下温柔的阴影。场景以写实的摄影风格呈现,细节精细,景深浅,色调温暖。女孩专注的表情和空气中的粉笔灰增添了动感。背景元素包括课桌和教育海报,略微模糊以突出中心动作。超精细32K分辨率,单反质量,柔和的散景效果,纪录片式的构图。',
+ '一个台球桌上放着两排台球,每排5个,第一行的台球上面分别写着"Qwen""Image" "将 "于" "8" ,第二排台球上面分别写着"月" "正" "式" "发" "布" 。',
+ ]
+
+ output_dir = 'examples_Pruning+Realism_LoRA'
+ os.makedirs(output_dir, exist_ok=True)
+ for prompt in prompts:
+     # Name each output file after the first 80 characters of its prompt
+     output_img_path = f"{output_dir}/{prompt[:80]}.png"
+     image = pipe(
+         prompt=prompt + positive_magic['zh'],
+         negative_prompt=negative_prompt,
+         width=1328,
+         height=1328,
+         num_inference_steps=8,
+         true_cfg_scale=1,
+         # Seed on the selected device so the script also runs without CUDA
+         generator=torch.Generator(device=device).manual_seed(42)
+     ).images[0]
+     image.save(output_img_path)
+ ```
+
+ ### 3. Qwen-Image-Pruning & ControlNet Inference
+ ```python
+ import os
+ import glob
+
+ import torch
+
+ from diffusers.utils import load_image
+ from diffusers import QwenImageControlNetPipeline, QwenImageControlNetModel
+
+ model_name = "OPPOer/Qwen-Image-Pruning"
+ controlnet_name = "InstantX/Qwen-Image-ControlNet-Union"
+
+ # Load the pipeline
+ if torch.cuda.is_available():
+     torch_dtype = torch.bfloat16
+     device = "cuda"
+ else:
+     torch_dtype = torch.bfloat16
+     device = "cpu"
+
+ controlnet = QwenImageControlNetModel.from_pretrained(controlnet_name, torch_dtype=torch_dtype)
+
+ pipe = QwenImageControlNetPipeline.from_pretrained(
+     model_name, controlnet=controlnet, torch_dtype=torch_dtype
+ )
+ pipe = pipe.to(device)
+
+ # Generate image: map each condition image in conds/ to its prompt
+ prompt_dict = {
+ "soft_edge.png": "Photograph of a young man with light brown hair jumping mid-air off a large, reddish-brown rock. He's wearing a navy blue sweater, light blue shirt, gray pants, and brown shoes. His arms are outstretched, and he has a slight smile on his face. The background features a cloudy sky and a distant, leafless tree line. The grass around the rock is patchy.",
+ "canny.png": "Aesthetics art, traditional asian pagoda, elaborate golden accents, sky blue and white color palette, swirling cloud pattern, digital illustration, east asian architecture, ornamental rooftop, intricate detailing on building, cultural representation.",
+ "depth.png": "A swanky, minimalist living room with a huge floor-to-ceiling window letting in loads of natural light. A beige couch with white cushions sits on a wooden floor, with a matching coffee table in front. The walls are a soft, warm beige, decorated with two framed botanical prints. A potted plant chills in the corner near the window. Sunlight pours through the leaves outside, casting cool shadows on the floor.",
+ "pose.png": "Photograph of a young man with light brown hair and a beard, wearing a beige flat cap, black leather jacket, gray shirt, brown pants, and white sneakers. He's sitting on a concrete ledge in front of a large circular window, with a cityscape reflected in the glass. The wall is cream-colored, and the sky is clear blue. His shadow is cast on the wall.",
+ }
+ controlnet_conditioning_scale = 1.0
+
+ output_dir = 'examples_Pruning+ControlNet'
+ os.makedirs(output_dir, exist_ok=True)
+
+ for path in glob.glob('conds/*'):
+     control_image = load_image(path)
+     image_name = os.path.basename(path)
+     # Only condition images that have a matching prompt are processed
+     if image_name in prompt_dict:
+         image = pipe(
+             prompt=prompt_dict[image_name],
+             negative_prompt=" ",
+             control_image=control_image,
+             controlnet_conditioning_scale=controlnet_conditioning_scale,
+             width=control_image.size[0],
+             height=control_image.size[1],
+             num_inference_steps=8,
+             true_cfg_scale=4.0,
+             # Seed on the selected device so the script also runs without CUDA
+             generator=torch.Generator(device=device).manual_seed(42),
+         ).images[0]
+         image.save(os.path.join(output_dir, image_name))
+ ```
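The README added above describes this upload as an fp8_e4m3fn-quantized conversion. To give a feel for what that number format can represent, here is a minimal pure-Python sketch of the E4M3FN layout (1 sign bit, 4 exponent bits with bias 7, 3 mantissa bits, no infinities). It is not the conversion script used for this repository, which the commit does not show. The helper names `decode_e4m3fn` and `quantize` are hypothetical, and snapping to the nearest value is an assumption; real converters may scale tensors, break ties differently, or handle overflow in another way.

```python
import math


def decode_e4m3fn(bits: int) -> float:
    """Decode one 8-bit pattern: 1 sign bit, 4 exponent bits (bias 7), 3 mantissa bits."""
    sign = -1.0 if bits & 0x80 else 1.0
    exponent = (bits >> 3) & 0x0F
    mantissa = bits & 0x07
    if exponent == 0x0F and mantissa == 0x07:
        return math.nan  # the only NaN encoding; the format has no infinities
    if exponent == 0:
        return sign * (mantissa / 8.0) * 2.0 ** -6  # subnormal range
    return sign * (1.0 + mantissa / 8.0) * 2.0 ** (exponent - 7)


# Every finite value the format can represent (+0 and -0 collapse into one entry).
FINITE_VALUES = sorted({v for v in map(decode_e4m3fn, range(256)) if not math.isnan(v)})


def quantize(x: float) -> float:
    """Snap x to the nearest representable value; out-of-range inputs land on +/-448."""
    return min(FINITE_VALUES, key=lambda v: abs(v - x))


if __name__ == "__main__":
    print("largest finite value:", FINITE_VALUES[-1])
    print("0.3 is stored as:", quantize(0.3))
```

With only three mantissa bits, neighbouring values inside one power-of-two interval are an eighth of that interval's lower bound apart, which is why weights stored this way take a quarter of the space of float32 at the cost of coarse rounding.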
bench.png ADDED

Git LFS Details

  • SHA256: 78823f445b259bf42ae5acd5c4c6c076c5a98deca9e864d2882057dadc9860a2
  • Pointer size: 132 Bytes
  • Size of remote file: 2.43 MB
example.jpg ADDED

Git LFS Details

  • SHA256: 4223b5570843c543055bda1c585d764e98b2fdbdd48c1f9a8af5e318fde346b1
  • Pointer size: 131 Bytes
  • Size of remote file: 735 kB
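The "Pointer size" figures above follow from the Git LFS pointer file format: a version line, an `oid sha256:` line, and a `size` line. The sketch below rebuilds such a pointer in Python and shows why a file whose byte count has seven digits gives a 132-byte pointer, while a six-digit byte count gives 131 bytes. The exact byte sizes are not shown on this page, so `HYPOTHETICAL_SIZE` is a placeholder rather than the real size of `bench.png`, and the helper names are hypothetical.

```python
import hashlib


def lfs_pointer(oid_sha256: str, size_bytes: int) -> str:
    """Build the text of a Git LFS pointer file (spec v1)."""
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{oid_sha256}\n"
        f"size {size_bytes}\n"
    )


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a downloaded file in chunks so it can be compared with the listed SHA256."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# SHA256 of bench.png as listed in the LFS details above.
BENCH_OID = "78823f445b259bf42ae5acd5c4c6c076c5a98deca9e864d2882057dadc9860a2"
# Placeholder: the page only reports "2.43 MB", not the exact byte count.
HYPOTHETICAL_SIZE = 2_430_000

pointer = lfs_pointer(BENCH_OID, HYPOTHETICAL_SIZE)

if __name__ == "__main__":
    print(pointer, end="")
    print("pointer length in bytes:", len(pointer.encode("utf-8")))
```

After downloading either image, comparing `sha256_of_file("bench.png")` with the SHA256 listed above is a simple integrity check.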