masterful/gligen-1-4-generation-text-box

nikhilg committed on Aug 16, 2023

Commit 13309b8 · 1 Parent(s): 0c6ed4f

Update README.md

Files changed (1)

README.md +17 -19

@@ -68,32 +68,30 @@ Running the pipeline with the default PNDM scheduler:
 ```python
 import torch
-import torchvision
 from diffusers import StableDiffusionGLIGENPipeline
-model_id = "masterful/gligen-1-4-generation-text-box"
-device = "cuda"
-pipe = StableDiffusionGLIGENPipeline.from_pretrained(model_id, variant="fp16", torch_dtype=torch.float16)
-pipe = pipe.to(device)
 prompt = "a waterfall and a modern high speed train running through the tunnel in a beautiful forest with fall foliage"
 images = pipe(
-    prompt,
-    num_images_per_prompt=1,
-    gligen_phrases = ['a waterfall', 'a modern high speed train running through the tunnel'],
-    gligen_boxes = [
-        [0.1387, 0.2051, 0.4277, 0.7090],
-        [0.4980, 0.4355, 0.8516, 0.7266],
-    ],
-    gligen_scheduled_sampling_beta=0.3,
-    output_type="np",
-    num_inference_steps=50
 ).images
-images = torch.stack([torch.from_numpy(image) for image in images]).permute(0, 3, 1, 2)
-torchvision.utils.save_image(images, "./gligen-1-4-generation-text-box.jpg", nrow=1, normalize=False)
 ```
@@ -181,4 +179,4 @@ Refer [`GLIGEN`](https://github.com/gligen/GLIGEN) for more details.
     }
 ```
-*This model card was written by: Robin Rombach and Patrick Esser and is based on the [DALL-E Mini model card](https://huggingface.co/dalle-mini/dalle-mini).*

 ```python
 import torch
 from diffusers import StableDiffusionGLIGENPipeline
+from diffusers.utils import load_image
+# Generate an image described by the prompt and
+# insert objects described by text at the region defined by bounding boxes
+pipe = StableDiffusionGLIGENPipeline.from_pretrained(
+    "masterful/gligen-1-4-generation-text-box", variant="fp16", torch_dtype=torch.float16
+)
+pipe = pipe.to("cuda")
 prompt = "a waterfall and a modern high speed train running through the tunnel in a beautiful forest with fall foliage"
+boxes = [[0.1387, 0.2051, 0.4277, 0.7090], [0.4980, 0.4355, 0.8516, 0.7266]]
+phrases = ["a waterfall", "a modern high speed train running through the tunnel"]
 images = pipe(
+    prompt=prompt,
+    gligen_phrases=phrases,
+    gligen_boxes=boxes,
+    gligen_scheduled_sampling_beta=1,
+    output_type="pil",
+    num_inference_steps=50,
 ).images
+images[0].save("./gligen-1-4-generation-text-box.jpg")
 ```
     }
 ```
+*This model card was written by: [Nikhil Gajendrakumar](https://github.com/nikhil-masterful) and is based on the [DALL-E Mini model card](https://huggingface.co/dalle-mini/dalle-mini).*
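
The `gligen_boxes` values in the README example are fractions of the image size in `[xmin, ymin, xmax, ymax]` order, each between 0 and 1. When boxes start out in pixel coordinates, a small conversion step is needed first. The following is a minimal sketch of that step, not part of this commit: `normalize_boxes` is a hypothetical helper, and the pixel values and the 512x512 canvas in the usage lines are assumptions chosen for illustration.

```python
from typing import List, Sequence


def normalize_boxes(
    pixel_boxes: Sequence[Sequence[float]],
    width: int,
    height: int,
    ndigits: int = 4,
) -> List[List[float]]:
    """Convert [xmin, ymin, xmax, ymax] pixel boxes to fractions in [0, 1].

    Hypothetical helper: it only prepares values for the `gligen_boxes`
    argument and does not call the pipeline itself.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")

    normalized = []
    for xmin, ymin, xmax, ymax in pixel_boxes:
        # Reject boxes that are empty, inverted, or outside the canvas.
        if not (0 <= xmin < xmax <= width and 0 <= ymin < ymax <= height):
            raise ValueError(f"invalid box: {[xmin, ymin, xmax, ymax]}")
        normalized.append(
            [
                round(xmin / width, ndigits),
                round(ymin / height, ndigits),
                round(xmax / width, ndigits),
                round(ymax / height, ndigits),
            ]
        )
    return normalized


# Assumed pixel coordinates on an assumed 512x512 canvas (illustration only).
boxes = normalize_boxes([[71, 105, 219, 363], [255, 223, 436, 372]], 512, 512)
```

The resulting `boxes` list can be passed as `gligen_boxes=boxes`, with one entry in `gligen_phrases` per box. Validating before the call keeps a malformed box from silently producing an ungrounded region.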