aningineer committed on
Commit aeccac3
1 Parent(s): e07cbb0

Upload folder using huggingface_hub

Files changed (2)
  1. README.md +17 -3
  2. app.py +13 -2
README.md CHANGED
@@ -1,13 +1,28 @@
  ---
  title: ToDo
+ emoji: 🔥
  app_file: app.py
  sdk: gradio
  sdk_version: 4.19.2
  ---
- # ImprovedTokenMerge
+ # ToDo: Token Downsampling for Efficient Generation of High-Resolution Images
+ ---
+
+ This is a demo for our recently proposed method, ["ToDo: Token Downsampling for Efficient Generation of High-Resolution Images"](https://arxiv.org/abs/2402.13573), compared against a popular token merging method, ToMe.
+
+ ```
+ @misc{smith2024todo,
+       title={ToDo: Token Downsampling for Efficient Generation of High-Resolution Images},
+       author={Ethan Smith and Nayan Saxena and Aninda Saha},
+       year={2024},
+       eprint={2402.13573},
+       archivePrefix={arXiv}
+ }
+ ```
+
  ![GEuoFn1bMAABQqD](https://github.com/ethansmith2000/ImprovedTokenMerge/assets/98723285/82e03423-81e6-47da-afa4-9c1b2c1c4aeb)

- twitter thread explanation: https://twitter.com/Ethan_smith_20/status/1750533558509433137
+ blog post: https://sweet-hall-e72.notion.site/ToDo-Token-Downsampling-for-Efficient-Generation-of-High-Resolution-Images-b41be1ac8ddc46be8cd687e67dee2d84?pvs=4

  heavily inspired by https://github.com/dbolya/tomesd by @dbolya, a big thanks to the original authors.

@@ -15,7 +30,6 @@ This project aims to address some of the shortcomings of Token Merging for Stable
  I found with the original that you would have to use a high merging ratio to get really any speedups at all, and by then quality was tarnished. Benchmarks here: https://github.com/dbolya/tomesd/issues/19#issuecomment-1507593483


-
  I propose two changes to the original to solve this.
  1. Merging Method
  - the original calculates a similarity matrix of the input tokens and merges those with highest similarity
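
For readers skimming this commit, here is a minimal illustrative sketch (PyTorch, not code from this repository) of the contrast the README draws: ToMe-style merging builds a token similarity matrix and merges the most similar tokens, while ToDo-style token downsampling simply pools the spatial token grid, e.g. before attention keys and values are formed. The function names and the average-pooling choice below are assumptions for illustration, not the project's actual implementation.

```python
import torch
import torch.nn.functional as F


def downsample_tokens(hidden_states: torch.Tensor, factor: int = 2) -> torch.Tensor:
    """Hypothetical ToDo-style step: average-pool the (h, w) token grid so that
    downstream attention keys/values see fewer tokens (queries keep full resolution).
    hidden_states: (batch, h*w, dim) for a square latent grid."""
    b, n, d = hidden_states.shape
    h = w = int(n ** 0.5)                                    # assumes a square grid
    x = hidden_states.transpose(1, 2).reshape(b, d, h, w)    # (b, d, h, w)
    x = F.avg_pool2d(x, kernel_size=factor, stride=factor)   # merge neighbouring tokens
    return x.flatten(2).transpose(1, 2)                      # (b, n // factor**2, d)


def most_similar_partner(tokens: torch.Tensor) -> torch.Tensor:
    """Hypothetical ToMe-style step: the pairwise cosine-similarity matrix the
    README mentions, reduced to each token's most similar merge candidate."""
    normed = F.normalize(tokens, dim=-1)
    sim = normed @ normed.transpose(-1, -2)                  # (batch, n, n)
    sim.diagonal(dim1=-2, dim2=-1).fill_(float("-inf"))      # ignore self-similarity
    return sim.argmax(dim=-1)                                # index of most similar token


# Toy check on random "tokens" from a 64x64 latent grid with 320 channels
tokens = torch.randn(1, 64 * 64, 320)
print(downsample_tokens(tokens, factor=2).shape)             # torch.Size([1, 1024, 320])
print(most_similar_partner(tokens[:, :16]).shape)            # torch.Size([1, 16])
```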
app.py CHANGED
@@ -8,6 +8,15 @@ import math
  import numpy as np
  from PIL import Image

+ # Globals
+ css = """
+ h1 {
+     text-align: center;
+     display: block;
+ }
+ """
+
+ # Pipeline
  pipe = diffusers.StableDiffusionPipeline.from_pretrained("Lykon/DreamShaper").to("cuda", torch.float16)
  pipe.scheduler = diffusers.EulerDiscreteScheduler.from_config(pipe.scheduler.config)
  pipe.safety_checker = None
@@ -70,8 +79,10 @@ def generate(prompt, seed, steps, height_width, negative_prompt, guidance_scale,

      return base_img, merged_img, result

- with gr.Blocks() as demo:
-     gr.Label("ToDo: Token Downsampling for Efficient Generation of High-Resolution Images")
+
+
+ with gr.Blocks(css=css) as demo:
+     gr.Markdown("# ToDo: Token Downsampling for Efficient Generation of High-Resolution Images")
      prompt = gr.Textbox(interactive=True, label="prompt")
      negative_prompt = gr.Textbox(interactive=True, label="negative_prompt")
