Upload 3 files

Files changed (4)

.gitattributes +2 -0
README.md +209 -0
bench.png +3 -0
example.jpg +3 -0

.gitattributes CHANGED

@@ -33,3 +33,5 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+bench.png filter=lfs diff=lfs merge=lfs -text
+example.jpg filter=lfs diff=lfs merge=lfs -text
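
The two added lines route `bench.png` and `example.jpg` through Git LFS. As a rough illustration of how such rules select files, the sketch below parses the attribute lines visible in this hunk and checks file names against them. It is a simplification, not Git's matcher: it only handles patterns without a `/` (which Git matches against the file's base name), and the helper names `lfs_patterns` and `is_lfs_tracked` are made up for this example.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Attribute lines copied from the hunk above.
GITATTRIBUTES = """\
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
bench.png filter=lfs diff=lfs merge=lfs -text
example.jpg filter=lfs diff=lfs merge=lfs -text
"""


def lfs_patterns(text):
    """Return the patterns whose attributes include filter=lfs."""
    patterns = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and "filter=lfs" in fields[1:]:
            patterns.append(fields[0])
    return patterns


def is_lfs_tracked(path, patterns):
    """Approximate check: slash-free patterns are matched against the base name."""
    name = PurePosixPath(path).name
    return any(fnmatchcase(name, p) for p in patterns if "/" not in p)


patterns = lfs_patterns(GITATTRIBUTES)
for candidate in ["bench.png", "example.jpg", "README.md"]:
    print(candidate, is_lfs_tracked(candidate, patterns))
```

For an authoritative answer on a real checkout, `git check-attr filter -- <path>` reports the attribute Git itself resolves.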

README.md ADDED

	@@ -0,0 +1,209 @@

+---
+license: apache-2.0
+language:
+  - en
+  - zh
+library_name: diffusers
+pipeline_tag: text-to-image
+tasks:
+  - text-to-image-synthesis
+frameworks: PyTorch
+base_model:
+  - OPPOer/Qwen-Image-Pruning
+base_model_relation: quantized
+---
+# Qwen-Image-Pruning-for-ComfyUI
+===================================================================================
+This model is a converted and fp8_e4m3fn-quantized version of https://huggingface.co/OPPOer/Qwen-Image-Pruning, packaged so that ComfyUI users can load it easily.
+<p align="center">
+    <img src="example.jpg" width="600"/>
+</p>
+## License Agreement
+This model falls under the Qwen-Image license, which is Apache-2.0.
+The following section is quoted from the original model card:
+===================================================================================
+<div align="center">
+  <h1>Qwen-Image-Pruning</h1>
+<a href='https://github.com/OPPO-Mente-Lab/Qwen-Image-Pruning'><img src="https://img.shields.io/badge/GitHub-OPPOer-blue.svg?logo=github" alt="GitHub"></a>
+</div>
+## Introduction
+This open-source project builds on Qwen-Image and applies layer pruning: 20 layers are removed and the weights of the remaining 40 layers are retained, giving a model of 13.3B parameters. The pruned model shows a slight drop in objective metrics, and further iterations of the pruned version are planned. The pruned version also supports adapting and loading community models such as LoRA and ControlNet. For the inference scripts, see https://github.com/OPPO-Mente-Lab/Qwen-Image-Pruning.
+<div align="center">
+  <img src="bench.png">
+</div>
+## Quick Start
+Install the latest versions of diffusers and PyTorch:
+```
+pip install torch
+pip install git+https://github.com/huggingface/diffusers
+```
+### 1. Qwen-Image-Pruning Inference
+```python
+import torch
+import os
+from diffusers import DiffusionPipeline
+model_name = "OPPOer/Qwen-Image-Pruning"
+if torch.cuda.is_available():
+    torch_dtype = torch.bfloat16
+    device = "cuda"
+else:
+    torch_dtype = torch.bfloat16
+    device = "cpu"
+pipe = DiffusionPipeline.from_pretrained(model_name, torch_dtype=torch_dtype)
+pipe = pipe.to(device)
+# Generate image
+positive_magic = {
+    "en": ", Ultra HD, 4K, cinematic composition.",  # suffix for English prompts
+    "zh": "，超清，4K，电影级构图。",  # suffix for Chinese prompts
+}
+negative_prompt = " "
+prompts = [
+    '一个穿着"QWEN"标志的T恤的中国美女正拿着黑色的马克笔面相镜头微笑。她身后的玻璃板上手写体写着 "一、Qwen-Image的技术路线： 探索视觉生成基础模型的极限，开创理解与生成一体化的未来。二、Qwen-Image的模型特色：1、复杂文字渲染。支持中英渲染、自动布局； 2、精准图像编辑。支持文字编辑、物体增减、风格变换。三、Qwen-Image的未来愿景：赋能专业内容创作、助力生成式AI发展。"',
+    '海报，温馨家庭场景，柔和阳光洒在野餐布上，色彩温暖明亮，主色调为浅黄、米白与淡绿，点缀着鲜艳的水果和野花，营造轻松愉快的氛围，画面简洁而富有层次，充满生活气息，传达家庭团聚与自然和谐的主题。文字内容：“共享阳光，共享爱。全家一起野餐，享受美好时光。让每一刻都充满欢笑与温暖。”',
+    '一个穿着校服的年轻女孩站在教室里，在黑板上写字。黑板中央用整洁的白粉笔写着“Introducing Qwen-Image, a foundational image generation model that excels in complex text rendering and precise image editing”。柔和的自然光线透过窗户，投下温柔的阴影。场景以写实的摄影风格呈现，细节精细，景深浅，色调温暖。女孩专注的表情和空气中的粉笔灰增添了动感。背景元素包括课桌和教育海报，略微模糊以突出中心动作。超精细32K分辨率，单反质量，柔和的散景效果，纪录片式的构图。',
+    '一个台球桌上放着两排台球，每排5个，第一行的台球上面分别写着"Qwen""Image" "将 "于" "8" ，第二排台球上面分别写着"月" "正" "式" "发" "布" 。',
+]
+output_dir = 'examples_Pruning'
+os.makedirs(output_dir, exist_ok=True)
+for prompt in prompts:
+    output_img_path = f"{output_dir}/{prompt[:80]}.png"
+    image = pipe(
+        prompt=prompt + positive_magic['zh'],
+        negative_prompt=negative_prompt,
+        width=1328,
+        height=1328,
+        num_inference_steps=8,
+        true_cfg_scale=1,
+        generator=torch.Generator(device=device).manual_seed(42)
+    ).images[0]
+    image.save(output_img_path)
+```
+### 2. Qwen-Image-Pruning & Realism-LoRA Inference
+```python
+import torch
+import os
+from diffusers import DiffusionPipeline
+model_name = "OPPOer/Qwen-Image-Pruning"
+lora_name = 'flymy_realism.safetensors'
+if torch.cuda.is_available():
+    torch_dtype = torch.bfloat16
+    device = "cuda"
+else:
+    torch_dtype = torch.bfloat16
+    device = "cpu"
+pipe = DiffusionPipeline.from_pretrained(model_name, torch_dtype=torch_dtype)
+pipe = pipe.to(device)
+pipe.load_lora_weights(lora_name, adapter_name="lora")
+# Generate image
+positive_magic = {
+    "en": ", Ultra HD, 4K, cinematic composition.",  # suffix for English prompts
+    "zh": "，超清，4K，电影级构图。",  # suffix for Chinese prompts
+}
+negative_prompt = " "
+prompts = [
+    '一个穿着"QWEN"标志的T恤的中国美女正拿着黑色的马克笔面相镜头微笑。她身后的玻璃板上手写体写着 "一、Qwen-Image的技术路线： 探索视觉生成基础模型的极限，开创理解与生成一体化的未来。二、Qwen-Image的模型特色：1、复杂文字渲染。支持中英渲染、自动布局； 2、精准图像编辑。支持文字编辑、物体增减、风格变换。三、Qwen-Image的未来愿景：赋能专业内容创作、助力生成式AI发展。"',
+    '海报，温馨家庭场景，柔和阳光洒在野餐布上，色彩温暖明亮，主色调为浅黄、米白与淡绿，点缀着鲜艳的水果和野花，营造轻松愉快的氛围，画面简洁而富有层次，充满生活气息，传达家庭团聚与自然和谐的主题。文字内容：“共享阳光，共享爱。全家一起野餐，享受美好时光。让每一刻都充满欢笑与温暖。”',
+    '一个穿着校服的年轻女孩站在教室里，在黑板上写字。黑板中央用整洁的白粉笔写着“Introducing Qwen-Image, a foundational image generation model that excels in complex text rendering and precise image editing”。柔和的自然光线透过窗户，投下温柔的阴影。场景以写实的摄影风格呈现，细节精细，景深浅，色调温暖。女孩专注的表情和空气中的粉笔灰增添了动感。背景元素包括课桌和教育海报，略微模糊以突出中心动作。超精细32K分辨率，单反质量，柔和的散景效果，纪录片式的构图。',
+    '一个台球桌上放着两排台球，每排5个，第一行的台球上面分别写着"Qwen""Image" "将 "于" "8" ，第二排台球上面分别写着"月" "正" "式" "发" "布" 。',
+]
+output_dir = 'examples_Pruning+Realism_LoRA'
+os.makedirs(output_dir, exist_ok=True)
+for prompt in prompts:
+    output_img_path = f"{output_dir}/{prompt[:80]}.png"
+    image = pipe(
+        prompt=prompt + positive_magic['zh'],
+        negative_prompt=negative_prompt,
+        width=1328,
+        height=1328,
+        num_inference_steps=8,
+        true_cfg_scale=1,
+        generator=torch.Generator(device=device).manual_seed(42)
+    ).images[0]
+    image.save(output_img_path)
+```
+### 3. Qwen-Image-Pruning & ControlNet Inference
+```python
+import os
+import glob
+import torch
+from diffusers import QwenImageControlNetModel, QwenImageControlNetPipeline
+from diffusers.utils import load_image
+model_name = "OPPOer/Qwen-Image-Pruning"
+controlnet_name = "InstantX/Qwen-Image-ControlNet-Union"
+# Load the pipeline
+if torch.cuda.is_available():
+    torch_dtype = torch.bfloat16
+    device = "cuda"
+else:
+    torch_dtype = torch.bfloat16
+    device = "cpu"
+controlnet = QwenImageControlNetModel.from_pretrained(controlnet_name, torch_dtype=torch.bfloat16)
+pipe = QwenImageControlNetPipeline.from_pretrained(
+    model_name, controlnet=controlnet, torch_dtype=torch.bfloat16
+)
+pipe = pipe.to(device)
+# Generate image
+prompt_dict = {
+    "soft_edge.png": "Photograph of a young man with light brown hair jumping mid-air off a large, reddish-brown rock. He's wearing a navy blue sweater, light blue shirt, gray pants, and brown shoes. His arms are outstretched, and he has a slight smile on his face. The background features a cloudy sky and a distant, leafless tree line. The grass around the rock is patchy.",
+    "canny.png": "Aesthetics art, traditional asian pagoda, elaborate golden accents, sky blue and white color palette, swirling cloud pattern, digital illustration, east asian architecture, ornamental rooftop, intricate detailing on building, cultural representation.",
+    "depth.png": "A swanky, minimalist living room with a huge floor-to-ceiling window letting in loads of natural light. A beige couch with white cushions sits on a wooden floor, with a matching coffee table in front. The walls are a soft, warm beige, decorated with two framed botanical prints. A potted plant chills in the corner near the window. Sunlight pours through the leaves outside, casting cool shadows on the floor.",
+    "pose.png": "Photograph of a young man with light brown hair and a beard, wearing a beige flat cap, black leather jacket, gray shirt, brown pants, and white sneakers. He's sitting on a concrete ledge in front of a large circular window, with a cityscape reflected in the glass. The wall is cream-colored, and the sky is clear blue. His shadow is cast on the wall.",
+}
+controlnet_conditioning_scale = 1.0
+output_dir = 'examples_Pruning+ControlNet'
+os.makedirs(output_dir, exist_ok=True)
+for path in glob.glob('conds/*'):
+    control_image = load_image(path)
+    image_name = os.path.basename(path)
+    if image_name in prompt_dict:
+        image = pipe(
+            prompt=prompt_dict[image_name],
+            negative_prompt=" ",
+            control_image=control_image,
+            controlnet_conditioning_scale=controlnet_conditioning_scale,
+            width=control_image.size[0],
+            height=control_image.size[1],
+            num_inference_steps=8,
+            true_cfg_scale=4.0,
+            generator=torch.Generator(device=device).manual_seed(42),
+        ).images[0]
+        image.save(os.path.join(output_dir, image_name))
+```
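
The README above describes these weights as an `fp8_e4m3fn` quantization. To give a feel for what that 8-bit format can represent, here is a minimal pure-Python sketch that decodes every E4M3FN bit pattern (1 sign bit, 4 exponent bits with bias 7, 3 mantissa bits, no infinities, a single NaN mantissa pattern). It does not reproduce the conversion used for this repository; `decode_e4m3fn` and `quantize` are illustrative names, and `quantize` simply picks the nearest representable value and clamps out-of-range inputs, which is a choice made for this example only.

```python
import math


def decode_e4m3fn(byte):
    """Decode one 8-bit E4M3FN pattern (0..255) into a Python float."""
    sign = -1.0 if byte & 0x80 else 1.0
    exponent = (byte >> 3) & 0x0F
    mantissa = byte & 0x07
    if exponent == 0x0F and mantissa == 0x07:
        return math.nan  # the format reserves this pattern for NaN; there is no infinity
    if exponent == 0:
        return sign * (mantissa / 8.0) * 2.0 ** -6  # subnormal numbers
    return sign * (1.0 + mantissa / 8.0) * 2.0 ** (exponent - 7)


# All finite values the format can hold, in ascending order.
values = sorted(v for v in map(decode_e4m3fn, range(256)) if not math.isnan(v))
largest = values[-1]
smallest_positive = min(v for v in values if v > 0)


def quantize(x):
    """Nearest representable value; inputs beyond the range clamp to the extremes."""
    return min(values, key=lambda v: abs(v - x))


print(largest, smallest_positive, quantize(0.3))
```

With only three mantissa bits, neighbouring values near 0.3 are 0.03125 apart, which is why fp8 checkpoints trade some numerical precision for roughly half the storage of bf16.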

bench.png ADDED

Git LFS Details

SHA256: 78823f445b259bf42ae5acd5c4c6c076c5a98deca9e864d2882057dadc9860a2
Pointer size: 132 Bytes
Size of remote file: 2.43 MB

example.jpg ADDED

Git LFS Details

SHA256: 4223b5570843c543055bda1c585d764e98b2fdbdd48c1f9a8af5e318fde346b1
Pointer size: 131 Bytes
Size of remote file: 735 kB
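
The "Pointer size" values above (132 and 131 bytes) follow from the Git LFS pointer file layout: a `version` line, an `oid sha256:` line, and a `size` line, each newline-terminated. The sketch below rebuilds both pointers to show where the one-byte difference comes from. The page only shows rounded sizes (2.43 MB and 735 kB), so the byte counts used here are placeholder assumptions with the matching number of digits, not the real file sizes; `lfs_pointer` is a helper name invented for this example.

```python
def lfs_pointer(sha256_hex, size_bytes):
    """Build the text of a Git LFS pointer file for the given object."""
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{sha256_hex}\n"
        f"size {size_bytes}\n"
    )


BENCH_SHA = "78823f445b259bf42ae5acd5c4c6c076c5a98deca9e864d2882057dadc9860a2"
EXAMPLE_SHA = "4223b5570843c543055bda1c585d764e98b2fdbdd48c1f9a8af5e318fde346b1"

# Placeholder sizes (assumptions): a 7-digit and a 6-digit byte count.
bench_pointer = lfs_pointer(BENCH_SHA, 2_430_000)
example_pointer = lfs_pointer(EXAMPLE_SHA, 735_000)

print(len(bench_pointer.encode("utf-8")))    # 132
print(len(example_pointer.encode("utf-8")))  # 131
```

The fixed parts of the pointer add up to 125 bytes, so the pointer size is 125 plus the number of decimal digits in the file's byte count.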