Hm, for some reason my workflow comment is not appearing. Hopefully this works:

Positive Prompt: photo of a (skinny woman:1.3) posing dramatically, hand on hip, leaning on wooden crate, standing, finely detailed features, wide angle, nautical, grimy industrial port, outdoors, stunning photo, cinematic lighting, ({1-2$$blemishes|acne|freckles}:0.5) ADDBASE [English|Lebanese] woman, (age 40:1.4), (black hair in a tight bun:1.2), hand resting on head, smile, (eyes closed:1.3), (big forehead, big nose:0.4), earring studs, skinny eyebrows, cloudy sky, seagulls flying in distance, BREAK (red shirt:1.4), (small breasts, flat chest:1.2), boats, BREAK (long black skirt:1.3), cotton tube skirt, wood dock, BREAK black tube skirt, (yellow skirt hemline, embroidered band on skirt:1.3), wooden crates, ropes BREAK brown leather boots, tall boots, deck boards, ropes

Divide Ratio: 22,20,26,7,29

Negative Prompt: low quality, mutated, deformed, 3d model, (blurry:1.3), cartoon, b&w, out of focus, out of frame, closeup, child, teen, asian, selfie, leggings, smooth skin, (breasts:1.3), nametag, (head tilted up:1.5)

ControlNet: scribble_pidinet + openpose

Model: realisticVision v2

Basically, I used a rough-looking scribble to generalize the form of the cartoon, and traced the pose in the OpenPose extension. I had a tough time assigning prompts to certain parts of the image, so I used the Regional Prompter extension.

To get the areas to prompt, I measured from the top of the image down to her shoulders, which was 110px. Out of the 500px tall image, that's 22%. Now I could prompt for her head and the sky in the first segment. Next came her shirt at 20% and her skirt at 26%. I made a very narrow 7% rectangle for the yellow band, and her boots got 29%. This adds up to 104%, but it doesn't need to be perfect. Thus my Divide Ratio field was 22,20,26,7,29
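
If you'd rather not do the percentages by hand, here is a minimal Python sketch of the arithmetic. The function names are made up for illustration and are not part of the extension. Only the 110px head measurement comes from my notes, so the other segments are entered directly as percentages. The last step assumes the extension treats the values as relative weights, which would explain why a total of 104 still works.

```python
def to_percent(segment_px, image_px):
    """Convert a segment height in pixels to a whole-number percentage."""
    return round(100 * segment_px / image_px)


def relative_shares(ratios):
    """Rescale ratios so they sum to 1 (assumed behavior, see note above)."""
    total = sum(ratios)
    return [r / total for r in ratios]


# Head and sky: 110 px of a 500 px tall image -> 22
head = to_percent(110, 500)

# Shirt, skirt, yellow band, boots, taken from the post as percentages
ratios = [head, 20, 26, 7, 29]

# The string that goes into the Divide Ratio field
ratio_field = ",".join(str(r) for r in ratios)

# Share of the image height each segment gets if the values are relative
shares = relative_shares(ratios)

print(ratio_field)
print(sum(ratios))
```
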

I first described the general image, which was more or less in all the segments, and used the special ADDBASE command at the end: photo of a (skinny woman:1.3) posing dramatically, hand on hip, leaning on wooden crate, standing, finely detailed features, wide angle, nautical, grimy industrial port, outdoors, stunning photo, cinematic lighting, ({1-2$$blemishes|acne|freckles}:0.5) ADDBASE

Now for the segments, there's the special BREAK command at the end of each segment prompt. So for the topmost segment, I described the top of the image (not just the foreground): [English|Lebanese] woman, (age 40:1.4), (black hair in a tight bun:1.2), hand resting on head, smile, (eyes closed:1.3), (big forehead, big nose:0.4), earring studs, skinny eyebrows, cloudy sky, seagulls flying in distance, BREAK

Then her shirt (trying to fight the default big boobage): (red shirt:1.4), (small breasts, flat chest:1.2), boats, BREAK

And so on. I used the negative prompt as a global negative, since it applied to the entire image.
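
To show how the pieces fit together, here is a small Python sketch that builds the full positive prompt from a base prompt and a list of segment prompts. It is plain string joining, nothing from the extension itself, and `build_regional_prompt` is a hypothetical helper name. The ADDBASE and BREAK keywords are the ones described above, and the segment texts are shortened versions of mine.

```python
def build_regional_prompt(base, regions):
    """Join a base prompt and per-segment prompts with ADDBASE and BREAK."""
    cleaned = [r.strip() for r in regions]
    return base.strip() + " ADDBASE " + " BREAK ".join(cleaned)


# Shortened versions of the prompts from the post, top segment first
base = "photo of a (skinny woman:1.3) posing dramatically, nautical, outdoors"
regions = [
    "(black hair in a tight bun:1.2), smile, cloudy sky",
    "(red shirt:1.4), boats",
    "(long black skirt:1.3), wood dock",
    "(yellow skirt hemline:1.3), wooden crates, ropes",
    "brown leather boots, deck boards, ropes",
]

prompt = build_regional_prompt(base, regions)
print(prompt)
```

Five segment prompts give four BREAKs, which lines up with the five values in the Divide Ratio field.
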

The prompts didn't originally have all the emphases in parentheses, but they were ultimately needed, as I was fighting a lot of recurring artifacts. For example, it kept giving her a name tag like a Staples employee!

I did fix some of the usual suspects using inpainting (hands) for the final result, then upscaled. It's still pretty uncanny valley but a fun way to learn a new extension. Edit: formatting