meghan3 committed on
Commit
dbad223
1 Parent(s): a1ffd71

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +84 -41
README.md CHANGED
@@ -9,7 +9,7 @@ tags:

  ---

- ![](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/controlnet_quantized/web-assets/banner.png)

  # ControlNet: Optimized for Mobile Deployment
  ## Generating visual arts from text prompt and input guiding image
@@ -37,10 +37,10 @@ More details on model performance across various devices, can be found

  | Device | Chipset | Target Runtime | Inference Time (ms) | Peak Memory Range (MB) | Precision | Primary Compute Unit | Target Model
  | ---|---|---|---|---|---|---|---|
- | Samsung Galaxy S23 Ultra (Android 13) | Snapdragon® 8 Gen 2 | QNN Model Library | 11.369 ms | 0 - 33 MB | UINT16 | NPU | [TextEncoder_Quantized.so](https://huggingface.co/qualcomm/ControlNet/blob/main/TextEncoder_Quantized.so)
- | Samsung Galaxy S23 Ultra (Android 13) | Snapdragon® 8 Gen 2 | QNN Model Library | 386.746 ms | 0 - 4 MB | UINT16 | NPU | [VAEDecoder_Quantized.so](https://huggingface.co/qualcomm/ControlNet/blob/main/VAEDecoder_Quantized.so)
- | Samsung Galaxy S23 Ultra (Android 13) | Snapdragon® 8 Gen 2 | QNN Model Library | 259.981 ms | 12 - 14 MB | UINT16 | NPU | [UNet_Quantized.so](https://huggingface.co/qualcomm/ControlNet/blob/main/UNet_Quantized.so)
- | Samsung Galaxy S23 Ultra (Android 13) | Snapdragon® 8 Gen 2 | QNN Model Library | 103.748 ms | 0 - 22 MB | UINT16 | NPU | [ControlNet_Quantized.so](https://huggingface.co/qualcomm/ControlNet/blob/main/ControlNet_Quantized.so)


  ## Installation
@@ -67,7 +67,7 @@ Navigate to [docs](https://app.aihub.qualcomm.com/docs/) for more information.



- ## Demo off target

  The package contains a simple end-to-end demo that downloads pre-trained
  weights and runs this model on a sample input.
@@ -135,11 +135,9 @@ This [export script](https://github.com/quic/ai-hub-models/blob/main/qai_hub_mod
  leverages [Qualcomm® AI Hub](https://aihub.qualcomm.com/) to optimize, validate, and deploy this model
  on-device. Let's go through each step below in detail:

- Step 1: **Compile model for on-device deployment**
-
- To compile a PyTorch model for on-device deployment, we first trace the model
- in memory using `jit.trace` and then call the `submit_compile_job` API.

  ```python
  import torch

@@ -147,40 +145,39 @@ import qai_hub as hub
  from qai_hub_models.models.controlnet_quantized import Model

  # Load the model
- torch_model = Model.from_pretrained()
- torch_model.eval()
-
- # Device
- device = hub.Device("Samsung Galaxy S23")
-
- # Trace model
- input_shape = torch_model.get_input_spec()
- sample_inputs = torch_model.sample_inputs()
-
- pt_model = torch.jit.trace(torch_model, [torch.tensor(data[0]) for _, data in sample_inputs.items()])
-
- # Compile model on a specific device
- compile_job = hub.submit_compile_job(
-     model=pt_model,
-     device=device,
-     input_specs=torch_model.get_input_spec(),
- )
-
- # Get target model to run on-device
- target_model = compile_job.get_target_model()

  ```


  Step 2: **Performance profiling on cloud-hosted device**

- After compiling the models in step 1, they can be profiled on-device using the
  `target_model`. Note that this script runs the model on a device automatically
  provisioned in the cloud. Once the job is submitted, you can navigate to a
  provided job URL to view a variety of on-device performance metrics.
  ```python
- profile_job = hub.submit_profile_job(
-     model=target_model,
      device=device,
  )

@@ -191,14 +188,38 @@ Step 3: **Verify on-device accuracy**
  To verify the accuracy of the model on-device, you can run on-device inference
  on sample input data on the same cloud-hosted device.
  ```python
- input_data = torch_model.sample_inputs()
- inference_job = hub.submit_inference_job(
-     model=target_model,
      device=device,
-     inputs=input_data,
  )

- on_device_output = inference_job.download_output_data()

  ```
  With the output of the model, you can compute metrics like PSNR, relative errors or
@@ -218,9 +239,9 @@ The models can be deployed using multiple runtimes:
  guide to deploy the .tflite model in an Android application.


- - QNN (`.so` export): This [sample
  app](https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-50/sample_app.html)
- provides instructions on how to use the `.so` shared library in an Android application.


  ## View on Qualcomm® AI Hub
@@ -241,3 +262,25 @@ Explore all available models on [Qualcomm® AI Hub](https://aihub.qualcomm.com/)
  * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com).


@@ -9,7 +9,7 @@ tags:

  ---

+ ![](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/controlnet_quantized/web-assets/model_demo.png)

  # ControlNet: Optimized for Mobile Deployment
  ## Generating visual arts from text prompt and input guiding image
 
@@ -37,10 +37,10 @@ More details on model performance across various devices, can be found

  | Device | Chipset | Target Runtime | Inference Time (ms) | Peak Memory Range (MB) | Precision | Primary Compute Unit | Target Model
  | ---|---|---|---|---|---|---|---|
+ | Samsung Galaxy S23 Ultra (Android 13) | Snapdragon® 8 Gen 2 | QNN Binary | 11.369 ms | 0 - 33 MB | UINT16 | NPU | [TextEncoder_Quantized.bin](https://huggingface.co/qualcomm/ControlNet/blob/main/TextEncoder_Quantized.bin)
+ | Samsung Galaxy S23 Ultra (Android 13) | Snapdragon® 8 Gen 2 | QNN Binary | 386.746 ms | 0 - 4 MB | UINT16 | NPU | [VAEDecoder_Quantized.bin](https://huggingface.co/qualcomm/ControlNet/blob/main/VAEDecoder_Quantized.bin)
+ | Samsung Galaxy S23 Ultra (Android 13) | Snapdragon® 8 Gen 2 | QNN Binary | 259.981 ms | 12 - 14 MB | UINT16 | NPU | [UNet_Quantized.bin](https://huggingface.co/qualcomm/ControlNet/blob/main/UNet_Quantized.bin)
+ | Samsung Galaxy S23 Ultra (Android 13) | Snapdragon® 8 Gen 2 | QNN Binary | 103.748 ms | 0 - 22 MB | UINT16 | NPU | [ControlNet_Quantized.bin](https://huggingface.co/qualcomm/ControlNet/blob/main/ControlNet_Quantized.bin)
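As a rough reading of the table, the per-component latencies can be combined into an end-to-end estimate. The sketch below assumes (the README does not state this) that the text encoder and VAE decoder each run once per image, while the UNet and ControlNet each run once per denoising step. The helper name `estimate_latency_ms` and the step count are hypothetical, and tokenizer, scheduler, and data-transfer overheads are ignored.

```python
# Per-component inference times in milliseconds, copied from the table.
LATENCY_MS = {
    "text_encoder": 11.369,
    "vae_decoder": 386.746,
    "unet": 259.981,
    "controlnet": 103.748,
}


def estimate_latency_ms(num_steps: int) -> float:
    """Estimate one image's latency under the assumptions stated above.

    Assumption: text encoder and VAE decoder run once; UNet and ControlNet
    run once per denoising step. This is an illustration, not a measurement.
    """
    if num_steps < 1:
        raise ValueError("num_steps must be at least 1")
    run_once = LATENCY_MS["text_encoder"] + LATENCY_MS["vae_decoder"]
    per_step = LATENCY_MS["unet"] + LATENCY_MS["controlnet"]
    return run_once + num_steps * per_step


if __name__ == "__main__":
    # 20 steps is an arbitrary example value, not taken from the README.
    print(f"{estimate_latency_ms(20):.1f} ms")
```

Under these assumptions the per-step cost (UNet plus ControlNet) dominates, so the step count matters far more than the one-off text-encoder and decoder times.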


  ## Installation
 
@@ -67,7 +67,7 @@ Navigate to [docs](https://app.aihub.qualcomm.com/docs/) for more information.



+ ## Demo on-device

  The package contains a simple end-to-end demo that downloads pre-trained
  weights and runs this model on a sample input.
 
@@ -135,11 +135,9 @@ This [export script](https://github.com/quic/ai-hub-models/blob/main/qai_hub_mod
  leverages [Qualcomm® AI Hub](https://aihub.qualcomm.com/) to optimize, validate, and deploy this model
  on-device. Let's go through each step below in detail:

+ Step 1: **Upload compiled model**

+ Upload the compiled models from `qai_hub_models.models.controlnet_quantized` to AI Hub.
  ```python
  import torch

 
@@ -147,40 +145,39 @@ import qai_hub as hub
  from qai_hub_models.models.controlnet_quantized import Model

  # Load the model
+ model = Model.from_precompiled()

+ model_textencoder_quantized = hub.upload_model(model.text_encoder.get_target_model_path())
+ model_unet_quantized = hub.upload_model(model.unet.get_target_model_path())
+ model_vaedecoder_quantized = hub.upload_model(model.vae_decoder.get_target_model_path())
+ model_controlnet_quantized = hub.upload_model(model.controlnet.get_target_model_path())
  ```
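The four uploads above differ only in the component name, so the same step can be written as a loop. This is a minimal sketch, assuming the component attributes (`text_encoder`, `unet`, `vae_decoder`, `controlnet`) and `get_target_model_path()` behave as in the snippet above. The helper `upload_components` and its `upload_fn` parameter are hypothetical, introduced so the sketch runs without an AI Hub account; in real use you would pass `hub.upload_model`.

```python
from types import SimpleNamespace
from typing import Any, Callable, Dict

# Attribute names taken from the README snippet in this diff.
COMPONENTS = ("text_encoder", "unet", "vae_decoder", "controlnet")


def upload_components(model: Any, upload_fn: Callable[[str], Any]) -> Dict[str, Any]:
    """Upload every pipeline component and return the results keyed by name.

    `upload_fn` is a hypothetical injection point; pass `hub.upload_model`
    when running against Qualcomm AI Hub.
    """
    uploaded: Dict[str, Any] = {}
    for name in COMPONENTS:
        component = getattr(model, name)
        uploaded[name] = upload_fn(component.get_target_model_path())
    return uploaded


if __name__ == "__main__":
    # Stand-in model and uploader so the sketch runs offline.
    fake_model = SimpleNamespace(
        **{
            name: SimpleNamespace(get_target_model_path=lambda name=name: f"{name}.bin")
            for name in COMPONENTS
        }
    )
    print(upload_components(fake_model, upload_fn=lambda path: f"uploaded:{path}"))
```

Keeping the results in a dictionary also lets the later profiling and inference steps iterate over the same keys instead of repeating one call per component.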


  Step 2: **Performance profiling on cloud-hosted device**

+ After uploading the compiled models in step 1, they can be profiled on-device using the
  `target_model`. Note that this script runs the model on a device automatically
  provisioned in the cloud. Once the job is submitted, you can navigate to a
  provided job URL to view a variety of on-device performance metrics.
  ```python
+
+ # Device
+ device = hub.Device("Samsung Galaxy S23")
+ profile_job_textencoder_quantized = hub.submit_profile_job(
+     model=model_textencoder_quantized,
+     device=device,
+ )
+ profile_job_unet_quantized = hub.submit_profile_job(
+     model=model_unet_quantized,
+     device=device,
+ )
+ profile_job_vaedecoder_quantized = hub.submit_profile_job(
+     model=model_vaedecoder_quantized,
+     device=device,
+ )
+ profile_job_controlnet_quantized = hub.submit_profile_job(
+     model=model_controlnet_quantized,
      device=device,
  )

 
@@ -191,14 +188,38 @@ Step 3: **Verify on-device accuracy**
  To verify the accuracy of the model on-device, you can run on-device inference
  on sample input data on the same cloud-hosted device.
  ```python
+
+ input_data_textencoder_quantized = model.text_encoder.sample_inputs()
+ inference_job_textencoder_quantized = hub.submit_inference_job(
+     model=model_textencoder_quantized,
      device=device,
+     inputs=input_data_textencoder_quantized,
  )
+ on_device_output_textencoder_quantized = inference_job_textencoder_quantized.download_output_data()

+ input_data_unet_quantized = model.unet.sample_inputs()
+ inference_job_unet_quantized = hub.submit_inference_job(
+     model=model_unet_quantized,
+     device=device,
+     inputs=input_data_unet_quantized,
+ )
+ on_device_output_unet_quantized = inference_job_unet_quantized.download_output_data()
+
+ input_data_vaedecoder_quantized = model.vae_decoder.sample_inputs()
+ inference_job_vaedecoder_quantized = hub.submit_inference_job(
+     model=model_vaedecoder_quantized,
+     device=device,
+     inputs=input_data_vaedecoder_quantized,
+ )
+ on_device_output_vaedecoder_quantized = inference_job_vaedecoder_quantized.download_output_data()
+
+ input_data_controlnet_quantized = model.controlnet.sample_inputs()
+ inference_job_controlnet_quantized = hub.submit_inference_job(
+     model=model_controlnet_quantized,
+     device=device,
+     inputs=input_data_controlnet_quantized,
+ )
+ on_device_output_controlnet_quantized = inference_job_controlnet_quantized.download_output_data()

  ```
  With the output of the model, you can compute metrics like PSNR, relative errors or
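The sentence above is cut off by the diff context, but it names PSNR as one way to compare on-device output with a reference. The following is a minimal pure-Python sketch, assuming both outputs have already been flattened to equal-length sequences of floats and that `peak` (a hypothetical parameter) is the largest possible signal value. The structure returned by `download_output_data()` is not shown in this diff, so extracting the arrays is left out.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], test: Sequence[float], peak: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB between two equal-length sequences.

    `peak` is the maximum possible signal value (an assumption of this
    sketch; use 255.0 for 8-bit image data, for example).
    """
    if len(reference) != len(test) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the reference and the on-device output.
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0.0:
        return math.inf  # identical outputs
    return 10.0 * math.log10(peak ** 2 / mse)


if __name__ == "__main__":
    # Toy values standing in for a reference output and an on-device output.
    print(psnr([0.0, 0.5, 1.0], [0.0, 0.45, 0.95]))
```

A higher value means the on-device output is closer to the reference; what counts as acceptable depends on the model and is not specified in this README.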
 
@@ -218,9 +239,9 @@ The models can be deployed using multiple runtimes:
  guide to deploy the .tflite model in an Android application.


+ - QNN (`.so` / `.bin` export): This [sample
  app](https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-50/sample_app.html)
+ provides instructions on how to use the `.so` shared library or `.bin` context binary in an Android application.


  ## View on Qualcomm® AI Hub
 
@@ -241,3 +262,25 @@ Explore all available models on [Qualcomm® AI Hub](https://aihub.qualcomm.com/)
  * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com).


+ ## Usage and Limitations
+
+ Model may not be used for or in connection with any of the following applications:
+
+ - Accessing essential private and public services and benefits;
+ - Administration of justice and democratic processes;
+ - Assessing or recognizing the emotional state of a person;
+ - Biometric and biometrics-based systems, including categorization of persons based on sensitive characteristics;
+ - Education and vocational training;
+ - Employment and workers management;
+ - Exploitation of the vulnerabilities of persons resulting in harmful behavior;
+ - General purpose social scoring;
+ - Law enforcement;
+ - Management and operation of critical infrastructure;
+ - Migration, asylum and border control management;
+ - Predictive policing;
+ - Real-time remote biometric identification in public spaces;
+ - Recommender systems of social media platforms;
+ - Scraping of facial images (from the internet or otherwise); and/or
+ - Subliminal manipulation
+
+