meghan3 committed
Commit dccecda
1 Parent(s): a3b746b

Upload README.md with huggingface_hub

Files changed (1):
  1. README.md +41 -84
README.md CHANGED
@@ -9,7 +9,7 @@ tags:
 
 ---
 
-![](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/controlnet_quantized/web-assets/model_demo.png)
+![](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/controlnet_quantized/web-assets/banner.png)
 
 # ControlNet: Optimized for Mobile Deployment
 ## Generating visual arts from text prompt and input guiding image
@@ -37,10 +37,10 @@ More details on model performance across various devices, can be found
 
 | Device | Chipset | Target Runtime | Inference Time (ms) | Peak Memory Range (MB) | Precision | Primary Compute Unit | Target Model
 | ---|---|---|---|---|---|---|---|
-| Samsung Galaxy S23 Ultra (Android 13) | Snapdragon® 8 Gen 2 | QNN Binary | 11.369 ms | 0 - 33 MB | UINT16 | NPU | [TextEncoder_Quantized.bin](https://huggingface.co/qualcomm/ControlNet/blob/main/TextEncoder_Quantized.bin)
-| Samsung Galaxy S23 Ultra (Android 13) | Snapdragon® 8 Gen 2 | QNN Binary | 386.746 ms | 0 - 4 MB | UINT16 | NPU | [VAEDecoder_Quantized.bin](https://huggingface.co/qualcomm/ControlNet/blob/main/VAEDecoder_Quantized.bin)
-| Samsung Galaxy S23 Ultra (Android 13) | Snapdragon® 8 Gen 2 | QNN Binary | 259.981 ms | 12 - 14 MB | UINT16 | NPU | [UNet_Quantized.bin](https://huggingface.co/qualcomm/ControlNet/blob/main/UNet_Quantized.bin)
-| Samsung Galaxy S23 Ultra (Android 13) | Snapdragon® 8 Gen 2 | QNN Binary | 103.748 ms | 0 - 22 MB | UINT16 | NPU | [ControlNet_Quantized.bin](https://huggingface.co/qualcomm/ControlNet/blob/main/ControlNet_Quantized.bin)
+| Samsung Galaxy S23 Ultra (Android 13) | Snapdragon® 8 Gen 2 | QNN Model Library | 11.369 ms | 0 - 33 MB | UINT16 | NPU | [TextEncoder_Quantized.so](https://huggingface.co/qualcomm/ControlNet/blob/main/TextEncoder_Quantized.so)
+| Samsung Galaxy S23 Ultra (Android 13) | Snapdragon® 8 Gen 2 | QNN Model Library | 386.746 ms | 0 - 4 MB | UINT16 | NPU | [VAEDecoder_Quantized.so](https://huggingface.co/qualcomm/ControlNet/blob/main/VAEDecoder_Quantized.so)
+| Samsung Galaxy S23 Ultra (Android 13) | Snapdragon® 8 Gen 2 | QNN Model Library | 259.981 ms | 12 - 14 MB | UINT16 | NPU | [UNet_Quantized.so](https://huggingface.co/qualcomm/ControlNet/blob/main/UNet_Quantized.so)
+| Samsung Galaxy S23 Ultra (Android 13) | Snapdragon® 8 Gen 2 | QNN Model Library | 103.748 ms | 0 - 22 MB | UINT16 | NPU | [ControlNet_Quantized.so](https://huggingface.co/qualcomm/ControlNet/blob/main/ControlNet_Quantized.so)
 
 
 ## Installation
@@ -67,7 +67,7 @@ Navigate to [docs](https://app.aihub.qualcomm.com/docs/) for more information.
 
 
 
-## Demo on-device
+## Demo off target
 
 The package contains a simple end-to-end demo that downloads pre-trained
 weights and runs this model on a sample input.
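
(The demo command itself falls outside this hunk; by qai-hub-models convention such demos are launched as a module, e.g. `python -m qai_hub_models.models.controlnet_quantized.demo` — check the full README for the exact invocation.)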
@@ -135,9 +135,11 @@ This [export script](https://github.com/quic/ai-hub-models/blob/main/qai_hub_mod
 leverages [Qualcomm® AI Hub](https://aihub.qualcomm.com/) to optimize, validate, and deploy this model
 on-device. Let's go through each step below in detail:
 
-Step 1: **Upload compiled model**
-
-Upload compiled models from `qai_hub_models.models.controlnet_quantized` on hub.
+Step 1: **Compile model for on-device deployment**
+
+To compile a PyTorch model for on-device deployment, we first trace the model
+in memory using `jit.trace` and then call the `submit_compile_job` API.
 ```python
 import torch
 
@@ -145,39 +147,40 @@ import qai_hub as hub
 from qai_hub_models.models.controlnet_quantized import Model
 
 # Load the model
-model = Model.from_precompiled()
+torch_model = Model.from_pretrained()
+torch_model.eval()
+
+# Device
+device = hub.Device("Samsung Galaxy S23")
+
+# Trace model
+input_shape = torch_model.get_input_spec()
+sample_inputs = torch_model.sample_inputs()
+
+pt_model = torch.jit.trace(torch_model, [torch.tensor(data[0]) for _, data in sample_inputs.items()])
+
+# Compile model on a specific device
+compile_job = hub.submit_compile_job(
+    model=pt_model,
+    device=device,
+    input_specs=torch_model.get_input_spec(),
+)
+
+# Get target model to run on-device
+target_model = compile_job.get_target_model()
 
-model_textencoder_quantized = hub.upload_model(model.text_encoder.get_target_model_path())
-model_unet_quantized = hub.upload_model(model.unet.get_target_model_path())
-model_vaedecoder_quantized = hub.upload_model(model.vae_decoder.get_target_model_path())
-model_controlnet_quantized = hub.upload_model(model.controlnet.get_target_model_path())
 ```
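
For readers new to tracing: `torch.jit.trace` runs the model once on example inputs and records the executed operations as a static graph, which is what `submit_compile_job` consumes. A minimal, self-contained sketch with a toy module (a stand-in for illustration, not the actual ControlNet pipeline):

```python
import torch

# Toy stand-in for a real model; trace-friendly (no data-dependent branching).
class Toy(torch.nn.Module):
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.relu(x) + 1.0

example = torch.rand(1, 3, 8, 8)
traced = torch.jit.trace(Toy().eval(), example)  # records the ops executed on `example`

# The trace replays a frozen graph: whichever Python control-flow path ran on
# the example input is baked in, which is why models are put in eval() first.
print(traced.graph)
```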
 
 
 Step 2: **Performance profiling on cloud-hosted device**
 
-After uploading compiled models from step 1. Models can be profiled model on-device using the
+After compiling the model in step 1, it can be profiled on-device using the
 `target_model`. Note that this script runs the model on a device automatically
 provisioned in the cloud. Once the job is submitted, you can navigate to a
 provided job URL to view a variety of on-device performance metrics.
 ```python
-
-# Device
-device = hub.Device("Samsung Galaxy S23")
-profile_job_textencoder_quantized = hub.submit_profile_job(
-    model=model_textencoder_quantized,
-    device=device,
-)
-profile_job_unet_quantized = hub.submit_profile_job(
-    model=model_unet_quantized,
-    device=device,
-)
-profile_job_vaedecoder_quantized = hub.submit_profile_job(
-    model=model_vaedecoder_quantized,
-    device=device,
-)
-profile_job_controlnet_quantized = hub.submit_profile_job(
-    model=model_controlnet_quantized,
+profile_job = hub.submit_profile_job(
+    model=target_model,
     device=device,
 )
 
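
Beyond the job URL, profiling results can also be pulled down programmatically. A hedged sketch — `wait()` and `download_profile()` are assumed from recent `qai_hub` client releases; verify against the AI Hub docs for your installed version:

```python
# Block until profiling finishes, then fetch the metrics locally.
# wait() / download_profile() are assumed from recent qai_hub releases;
# check the AI Hub docs for the exact API in your client version.
status = profile_job.wait()
if not status.success:
    raise RuntimeError(f"Profile job failed: {status.message}")

profile = profile_job.download_profile()  # dict of on-device measurements
print(profile)
```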
 
@@ -188,38 +191,14 @@ Step 3: **Verify on-device accuracy**
 To verify the accuracy of the model on-device, you can run on-device inference
 on sample input data on the same cloud hosted device.
 ```python
-
-input_data_textencoder_quantized = model.text_encoder.sample_inputs()
-inference_job_textencoder_quantized = hub.submit_inference_job(
-    model=model_textencoder_quantized,
+input_data = torch_model.sample_inputs()
+inference_job = hub.submit_inference_job(
+    model=target_model,
     device=device,
-    inputs=input_data_textencoder_quantized,
+    inputs=input_data,
 )
-on_device_output_textencoder_quantized = inference_job_textencoder_quantized.download_output_data()
 
-input_data_unet_quantized = model.unet.sample_inputs()
-inference_job_unet_quantized = hub.submit_inference_job(
-    model=model_unet_quantized,
-    device=device,
-    inputs=input_data_unet_quantized,
-)
-on_device_output_unet_quantized = inference_job_unet_quantized.download_output_data()
-
-input_data_vaedecoder_quantized = model.vae_decoder.sample_inputs()
-inference_job_vaedecoder_quantized = hub.submit_inference_job(
-    model=model_vaedecoder_quantized,
-    device=device,
-    inputs=input_data_vaedecoder_quantized,
-)
-on_device_output_vaedecoder_quantized = inference_job_vaedecoder_quantized.download_output_data()
-
-input_data_controlnet_quantized = model.controlnet.sample_inputs()
-inference_job_controlnet_quantized = hub.submit_inference_job(
-    model=model_controlnet_quantized,
-    device=device,
-    inputs=input_data_controlnet_quantized,
-)
-on_device_output_controlnet_quantized = inference_job_controlnet_quantized.download_output_data()
+on_device_output = inference_job.download_output_data()
 
 ```
 With the output of the model, you can compute metrics like PSNR, relative errors, or
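
As a concrete version of that check, here is a minimal PSNR helper (NumPy only; the arrays below are hypothetical stand-ins for an on-device output and a local reference):

```python
import numpy as np

def psnr(reference: np.ndarray, test: np.ndarray, peak: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB between two equal-shaped arrays."""
    mse = np.mean((reference.astype(np.float64) - test.astype(np.float64)) ** 2)
    if mse == 0.0:
        return float("inf")  # outputs are identical
    return 10.0 * np.log10(peak ** 2 / mse)

# Hypothetical stand-ins: a local reference and a slightly noisy device output.
ref = np.random.rand(1, 3, 64, 64).astype(np.float32)
dev = (ref + np.random.normal(0.0, 1e-3, ref.shape)).astype(np.float32)
print(f"PSNR: {psnr(ref, dev):.2f} dB")  # higher means closer; ~60 dB here
```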
@@ -239,9 +218,9 @@ The models can be deployed using multiple runtimes:
 guide to deploy the .tflite model in an Android application.
 
 
-- QNN (`.so` / `.bin` export): This [sample
+- QNN (`.so` export): This [sample
 app](https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-50/sample_app.html)
-provides instructions on how to use the `.so` shared library or `.bin` context binary in an Android application.
+provides instructions on how to use the `.so` shared library in an Android application.
 
 
 ## View on Qualcomm® AI Hub
@@ -262,25 +241,3 @@ Explore all available models on [Qualcomm® AI Hub](https://aihub.qualcomm.com/)
 * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com).
 
 
-## Usage and Limitations
-
-Model may not be used for or in connection with any of the following applications:
-
-- Accessing essential private and public services and benefits;
-- Administration of justice and democratic processes;
-- Assessing or recognizing the emotional state of a person;
-- Biometric and biometrics-based systems, including categorization of persons based on sensitive characteristics;
-- Education and vocational training;
-- Employment and workers management;
-- Exploitation of the vulnerabilities of persons resulting in harmful behavior;
-- General purpose social scoring;
-- Law enforcement;
-- Management and operation of critical infrastructure;
-- Migration, asylum and border control management;
-- Predictive policing;
-- Real-time remote biometric identification in public spaces;
-- Recommender systems of social media platforms;
-- Scraping of facial images (from the internet or otherwise); and/or
-- Subliminal manipulation
-
-
 