meghan3 committed
Commit dccecda
1 Parent(s): a3b746b

Upload README.md with huggingface_hub

Files changed (1):
  1. README.md +41 -84
README.md CHANGED
@@ -9,7 +9,7 @@ tags:
 
 ---
 
-![](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/controlnet_quantized/web-assets/model_demo.png)
+![](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/controlnet_quantized/web-assets/banner.png)
 
 # ControlNet: Optimized for Mobile Deployment
 ## Generating visual arts from text prompt and input guiding image
@@ -37,10 +37,10 @@ More details on model performance across various devices, can be found
 
 | Device | Chipset | Target Runtime | Inference Time (ms) | Peak Memory Range (MB) | Precision | Primary Compute Unit | Target Model
 | ---|---|---|---|---|---|---|---|
-| Samsung Galaxy S23 Ultra (Android 13) | Snapdragon® 8 Gen 2 | QNN Binary | 11.369 ms | 0 - 33 MB | UINT16 | NPU | [TextEncoder_Quantized.bin](https://huggingface.co/qualcomm/ControlNet/blob/main/TextEncoder_Quantized.bin)
-| Samsung Galaxy S23 Ultra (Android 13) | Snapdragon® 8 Gen 2 | QNN Binary | 386.746 ms | 0 - 4 MB | UINT16 | NPU | [VAEDecoder_Quantized.bin](https://huggingface.co/qualcomm/ControlNet/blob/main/VAEDecoder_Quantized.bin)
-| Samsung Galaxy S23 Ultra (Android 13) | Snapdragon® 8 Gen 2 | QNN Binary | 259.981 ms | 12 - 14 MB | UINT16 | NPU | [UNet_Quantized.bin](https://huggingface.co/qualcomm/ControlNet/blob/main/UNet_Quantized.bin)
-| Samsung Galaxy S23 Ultra (Android 13) | Snapdragon® 8 Gen 2 | QNN Binary | 103.748 ms | 0 - 22 MB | UINT16 | NPU | [ControlNet_Quantized.bin](https://huggingface.co/qualcomm/ControlNet/blob/main/ControlNet_Quantized.bin)
+| Samsung Galaxy S23 Ultra (Android 13) | Snapdragon® 8 Gen 2 | QNN Model Library | 11.369 ms | 0 - 33 MB | UINT16 | NPU | [TextEncoder_Quantized.so](https://huggingface.co/qualcomm/ControlNet/blob/main/TextEncoder_Quantized.so)
+| Samsung Galaxy S23 Ultra (Android 13) | Snapdragon® 8 Gen 2 | QNN Model Library | 386.746 ms | 0 - 4 MB | UINT16 | NPU | [VAEDecoder_Quantized.so](https://huggingface.co/qualcomm/ControlNet/blob/main/VAEDecoder_Quantized.so)
+| Samsung Galaxy S23 Ultra (Android 13) | Snapdragon® 8 Gen 2 | QNN Model Library | 259.981 ms | 12 - 14 MB | UINT16 | NPU | [UNet_Quantized.so](https://huggingface.co/qualcomm/ControlNet/blob/main/UNet_Quantized.so)
+| Samsung Galaxy S23 Ultra (Android 13) | Snapdragon® 8 Gen 2 | QNN Model Library | 103.748 ms | 0 - 22 MB | UINT16 | NPU | [ControlNet_Quantized.so](https://huggingface.co/qualcomm/ControlNet/blob/main/ControlNet_Quantized.so)
 
 
 ## Installation
@@ -67,7 +67,7 @@ Navigate to [docs](https://app.aihub.qualcomm.com/docs/) for more information.
 
 
 
-## Demo on-device
+## Demo off target
 
 The package contains a simple end-to-end demo that downloads pre-trained
 weights and runs this model on a sample input.
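
(The demo command itself falls outside this hunk; by qai-hub-models convention such demos are launched as a module, e.g. `python -m qai_hub_models.models.controlnet_quantized.demo` — check the full README for the exact invocation.)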
@@ -135,9 +135,11 @@ This [export script](https://github.com/quic/ai-hub-models/blob/main/qai_hub_mod
 leverages [Qualcomm® AI Hub](https://aihub.qualcomm.com/) to optimize, validate, and deploy this model
 on-device. Let's go through each step below in detail:
 
-Step 1: **Upload compiled model**
-
-Upload compiled models from `qai_hub_models.models.controlnet_quantized` on hub.
+Step 1: **Compile model for on-device deployment**
+
+To compile a PyTorch model for on-device deployment, we first trace the model
+in memory using `jit.trace` and then call the `submit_compile_job` API.
 ```python
 import torch
 
@@ -145,39 +147,40 @@ import qai_hub as hub
 from qai_hub_models.models.controlnet_quantized import Model
 
 # Load the model
-model = Model.from_precompiled()
+torch_model = Model.from_pretrained()
+torch_model.eval()
+
+# Device
+device = hub.Device("Samsung Galaxy S23")
+
+# Trace model
+input_shape = torch_model.get_input_spec()
+sample_inputs = torch_model.sample_inputs()
+
+pt_model = torch.jit.trace(torch_model, [torch.tensor(data[0]) for _, data in sample_inputs.items()])
+
+# Compile model on a specific device
+compile_job = hub.submit_compile_job(
+    model=pt_model,
+    device=device,
+    input_specs=torch_model.get_input_spec(),
+)
+
+# Get target model to run on-device
+target_model = compile_job.get_target_model()
 
-model_textencoder_quantized = hub.upload_model(model.text_encoder.get_target_model_path())
-model_unet_quantized = hub.upload_model(model.unet.get_target_model_path())
-model_vaedecoder_quantized = hub.upload_model(model.vae_decoder.get_target_model_path())
-model_controlnet_quantized = hub.upload_model(model.controlnet.get_target_model_path())
 ```
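
For readers new to tracing: `torch.jit.trace` runs the model once on example inputs and records the executed operations as a static graph, which is what `submit_compile_job` consumes. A minimal, self-contained sketch with a toy module (a stand-in for illustration, not the actual ControlNet pipeline):

```python
import torch

# Toy stand-in for a real model; trace-friendly (no data-dependent branching).
class Toy(torch.nn.Module):
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.relu(x) + 1.0

example = torch.rand(1, 3, 8, 8)
traced = torch.jit.trace(Toy().eval(), example)  # records the ops executed on `example`

# The trace replays a frozen graph: whichever Python control-flow path ran on
# the example input is baked in, which is why models are put in eval() first.
print(traced.graph)
```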
 
 
 Step 2: **Performance profiling on cloud-hosted device**
 
-After uploading compiled models from step 1. Models can be profiled model on-device using the
+After compiling the model in step 1, it can be profiled on-device using the
 `target_model`. Note that this script runs the model on a device automatically
 provisioned in the cloud. Once the job is submitted, you can navigate to a
 provided job URL to view a variety of on-device performance metrics.
 ```python
-
-# Device
-device = hub.Device("Samsung Galaxy S23")
-profile_job_textencoder_quantized = hub.submit_profile_job(
-    model=model_textencoder_quantized,
-    device=device,
-)
-profile_job_unet_quantized = hub.submit_profile_job(
-    model=model_unet_quantized,
-    device=device,
-)
-profile_job_vaedecoder_quantized = hub.submit_profile_job(
-    model=model_vaedecoder_quantized,
-    device=device,
-)
-profile_job_controlnet_quantized = hub.submit_profile_job(
-    model=model_controlnet_quantized,
+profile_job = hub.submit_profile_job(
+    model=target_model,
     device=device,
 )
 
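
Beyond the job URL, profiling results can also be pulled down programmatically. A hedged sketch — `wait()` and `download_profile()` are assumed from recent `qai_hub` client releases; verify against the AI Hub docs for your installed version:

```python
# Block until profiling finishes, then fetch the metrics locally.
# wait() / download_profile() are assumed from recent qai_hub releases;
# check the AI Hub docs for the exact API in your client version.
status = profile_job.wait()
if not status.success:
    raise RuntimeError(f"Profile job failed: {status.message}")

profile = profile_job.download_profile()  # dict of on-device measurements
print(profile)
```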
 
@@ -188,38 +191,14 @@ Step 3: **Verify on-device accuracy**
 To verify the accuracy of the model on-device, you can run on-device inference
 on sample input data on the same cloud hosted device.
 ```python
-
-input_data_textencoder_quantized = model.text_encoder.sample_inputs()
-inference_job_textencoder_quantized = hub.submit_inference_job(
-    model=model_textencoder_quantized,
+input_data = torch_model.sample_inputs()
+inference_job = hub.submit_inference_job(
+    model=target_model,
     device=device,
-    inputs=input_data_textencoder_quantized,
+    inputs=input_data,
 )
-on_device_output_textencoder_quantized = inference_job_textencoder_quantized.download_output_data()
 
-input_data_unet_quantized = model.unet.sample_inputs()
-inference_job_unet_quantized = hub.submit_inference_job(
-    model=model_unet_quantized,
-    device=device,
-    inputs=input_data_unet_quantized,
-)
-on_device_output_unet_quantized = inference_job_unet_quantized.download_output_data()
-
-input_data_vaedecoder_quantized = model.vae_decoder.sample_inputs()
-inference_job_vaedecoder_quantized = hub.submit_inference_job(
-    model=model_vaedecoder_quantized,
-    device=device,
-    inputs=input_data_vaedecoder_quantized,
-)
-on_device_output_vaedecoder_quantized = inference_job_vaedecoder_quantized.download_output_data()
-
-input_data_controlnet_quantized = model.controlnet.sample_inputs()
-inference_job_controlnet_quantized = hub.submit_inference_job(
-    model=model_controlnet_quantized,
-    device=device,
-    inputs=input_data_controlnet_quantized,
-)
-on_device_output_controlnet_quantized = inference_job_controlnet_quantized.download_output_data()
+on_device_output = inference_job.download_output_data()
 
 ```
 With the output of the model, you can compute metrics like PSNR, relative errors, or
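
As a concrete version of that check, here is a minimal PSNR helper (NumPy only; the arrays below are hypothetical stand-ins for an on-device output and a local reference):

```python
import numpy as np

def psnr(reference: np.ndarray, test: np.ndarray, peak: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB between two equal-shaped arrays."""
    mse = np.mean((reference.astype(np.float64) - test.astype(np.float64)) ** 2)
    if mse == 0.0:
        return float("inf")  # outputs are identical
    return 10.0 * np.log10(peak ** 2 / mse)

# Hypothetical stand-ins: a local reference and a slightly noisy device output.
ref = np.random.rand(1, 3, 64, 64).astype(np.float32)
dev = (ref + np.random.normal(0.0, 1e-3, ref.shape)).astype(np.float32)
print(f"PSNR: {psnr(ref, dev):.2f} dB")  # higher means closer; ~60 dB here
```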
@@ -239,9 +218,9 @@ The models can be deployed using multiple runtimes:
 guide to deploy the .tflite model in an Android application.
 
 
-- QNN (`.so` / `.bin` export): This [sample
+- QNN (`.so` export): This [sample
 app](https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-50/sample_app.html)
-provides instructions on how to use the `.so` shared library or `.bin` context binary in an Android application.
+provides instructions on how to use the `.so` shared library in an Android application.
 
 
 ## View on Qualcomm® AI Hub
@@ -262,25 +241,3 @@ Explore all available models on [Qualcomm® AI Hub](https://aihub.qualcomm.com/)
 * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com).
 
 
-## Usage and Limitations
-
-Model may not be used for or in connection with any of the following applications:
-
-- Accessing essential private and public services and benefits;
-- Administration of justice and democratic processes;
-- Assessing or recognizing the emotional state of a person;
-- Biometric and biometrics-based systems, including categorization of persons based on sensitive characteristics;
-- Education and vocational training;
-- Employment and workers management;
-- Exploitation of the vulnerabilities of persons resulting in harmful behavior;
-- General purpose social scoring;
-- Law enforcement;
-- Management and operation of critical infrastructure;
-- Migration, asylum and border control management;
-- Predictive policing;
-- Real-time remote biometric identification in public spaces;
-- Recommender systems of social media platforms;
-- Scraping of facial images (from the internet or otherwise); and/or
-- Subliminal manipulation
-
-
 