---
license: apache-2.0
datasets:
- peteromallet/high-quality-midjouney-srefs
base_model:
- Qwen/Qwen-Image-Edit
tags:
- image
- editing
- lora
- scene-generation
- qwen
pipeline_tag: image-to-image
library_name: diffusers
---

# QwenEdit InScene LoRAs (Beta)

## Model Description

**InScene** and **InScene Annotate** are a pair of LoRA fine-tunes for QwenEdit that enhance its ability to generate images based on scene references. The two models work together to provide flexible scene-based image generation, with optional annotation support.

### InScene
InScene, the main model, generates images based on the scene composition and layout of a reference image. It is trained on pairs of different shots from the same scene, together with prompts describing the desired output. Its goal is to create entirely new shots within a scene while maintaining character consistency and scene coherence.

InScene is intentionally biased towards creating completely new shots rather than making minor edits. This design choice counteracts Qwen-Image-Edit's internal bias toward small, conservative edits, enabling more dramatic scene transformations while preserving the characters and overall scene identity.

![inscene-samples.png](inscene-samples.png)

### InScene Annotate
InScene Annotate is trained on images with green rectangles drawn over specific regions, and it learns to generate images showing the subject within the marked area. Rather than zooming in precisely on the rectangle, it is trained to interpret instructions flexibly: it captures the subject, its context, and the framing in a natural, composed way rather than producing a strict crop.

![inscene-annotate-samples.png](inscene-annotate-samples.png)

*InScene and InScene Annotate are currently in beta.*

## How to Use

### InScene
To use the base InScene model, start your prompt with:

`Make an image in this scene of `

then describe what you want to generate.

For example:
`Make an image in this scene of a bustling city street at night.`

### InScene Annotate
For the annotate variant, use an annotated reference image and start your prompt with:

`Based on this annotated scene, create `

For example:
`Based on this annotated scene, create a winter landscape with snow-covered mountains.`
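
The annotation itself is just a green rectangle drawn on the reference image. A minimal sketch of producing one with Pillow; the file names and box coordinates below are hypothetical placeholders, not values from the model card:

```python
from PIL import Image, ImageDraw

# Reference image to annotate; path is a placeholder
image = Image.open("scene.png").convert("RGB")
draw = ImageDraw.Draw(image)

# (left, top, right, bottom) in pixels: the region the model should interpret
box = (220, 140, 520, 480)
draw.rectangle(box, outline=(0, 255, 0), width=6)

image.save("scene_annotated.png")
```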

### Use with diffusers

**InScene:**
```python
import torch
from diffusers import QwenImageEditPipeline

# Load the base QwenEdit pipeline
pipe = QwenImageEditPipeline.from_pretrained("Qwen/Qwen-Image-Edit", torch_dtype=torch.bfloat16)
pipe.to("cuda")

# Apply the InScene LoRA weights
pipe.load_lora_weights("peteromallet/Qwen-Image-Edit-InScene", weight_name="InScene-0.7.safetensors")
```
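
Continuing from the snippet above, a minimal generation sketch using the InScene prompt prefix; the image path, seed, and step count are illustrative assumptions:

```python
from PIL import Image

# Reference image of the scene; path is a placeholder
scene = Image.open("scene.png").convert("RGB")

# Generate a new shot within the reference scene
result = pipe(
    image=scene,
    prompt="Make an image in this scene of a bustling city street at night.",
    num_inference_steps=40,
    generator=torch.Generator(device="cuda").manual_seed(0),
).images[0]

result.save("inscene_output.png")
```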

**InScene Annotate:**
```python
import torch
from diffusers import QwenImageEditPipeline

# Load the base QwenEdit pipeline
pipe = QwenImageEditPipeline.from_pretrained("Qwen/Qwen-Image-Edit", torch_dtype=torch.bfloat16)
pipe.to("cuda")

# Apply the InScene Annotate LoRA weights
pipe.load_lora_weights("peteromallet/Qwen-Image-Edit-InScene", weight_name="InScene-Annotate-0.7.safetensors")
```
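
Generation works the same way as the InScene example above, except the input is the annotated reference and the prompt uses the annotate prefix. A minimal sketch under the same assumptions; `scene_annotated.png` is the hypothetical output of the Pillow sketch earlier:

```python
from PIL import Image

# Annotated reference with the green rectangle drawn on it
annotated = Image.open("scene_annotated.png").convert("RGB")

result = pipe(
    image=annotated,
    prompt="Based on this annotated scene, create a winter landscape with snow-covered mountains.",
    num_inference_steps=40,
    generator=torch.Generator(device="cuda").manual_seed(0),
).images[0]

result.save("inscene_annotate_output.png")
```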

## Strengths & Weaknesses

The models excel at:
- Capturing scene composition and spatial layout from reference images
- Maintaining a consistent scene structure while varying content
- Understanding spatial relationships between elements
- Adhering closely to prompts while staying scene-aware
- (Annotate) Providing precise control through annotated references

The models may struggle with:
- Very complex, multi-layered scenes with numerous elements
- Extremely abstract or non-traditional scene compositions
- Fine-grained details that conflict with the reference scene layout
- Occasional depth-perception issues

## Training Data

The InScene and InScene Annotate LoRAs were trained on a curated dataset of high-quality Midjourney-style references, with additional scene-focused annotations for the Annotate variant. The dataset emphasizes diverse scene compositions and spatial relationships.

The public dataset used for training is available here:
[https://huggingface.co/datasets/peteromallet/high-quality-midjouney-srefs](https://huggingface.co/datasets/peteromallet/high-quality-midjouney-srefs)

## Links

- Model: [https://huggingface.co/peteromallet/Qwen-Image-Edit-InScene](https://huggingface.co/peteromallet/Qwen-Image-Edit-InScene)
- Dataset: [https://huggingface.co/datasets/peteromallet/high-quality-midjouney-srefs](https://huggingface.co/datasets/peteromallet/high-quality-midjouney-srefs)