qaihm-bot committed on
Commit a734065
1 Parent(s): bf8c5d8

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +11 -73
README.md CHANGED
@@ -32,9 +32,12 @@ More details on model performance across various devices can be found
  - Model size: 6.23 MB

+
+
  | Device | Chipset | Target Runtime | Inference Time (ms) | Peak Memory Range (MB) | Precision | Primary Compute Unit | Target Model
  | ---|---|---|---|---|---|---|---|
- | Samsung Galaxy S23 Ultra (Android 13) | Snapdragon® 8 Gen 2 | TFLite | 4.61 ms | 0 - 4 MB | INT8 | NPU | [Yolo-v7-Quantized.tflite](https://huggingface.co/qualcomm/Yolo-v7-Quantized/blob/main/Yolo-v7-Quantized.tflite)
+ | Samsung Galaxy S23 Ultra (Android 13) | Snapdragon® 8 Gen 2 | TFLite | 4.596 ms | 0 - 2 MB | INT8 | NPU | [Yolo-v7-Quantized.tflite](https://huggingface.co/qualcomm/Yolo-v7-Quantized/blob/main/Yolo-v7-Quantized.tflite)
+

  ## Installation
@@ -92,83 +95,18 @@ device. This script does the following:
  python -m qai_hub_models.models.yolov7_quantized.export
  ```

- ## How does this work?
-
- This [export script](https://github.com/quic/ai-hub-models/blob/main/qai_hub_models/models/Yolo-v7-Quantized/export.py)
- leverages [Qualcomm® AI Hub](https://aihub.qualcomm.com/) to optimize, validate, and deploy this model
- on-device. Let's go through each step below in detail:
-
- Step 1: **Compile model for on-device deployment**
-
- To compile a PyTorch model for on-device deployment, we first trace the model
- in memory using `jit.trace` and then call the `submit_compile_job` API.
-
- ```python
- import torch
-
- import qai_hub as hub
- from qai_hub_models.models.yolov7_quantized import Model
-
- # Load the model
- torch_model = Model.from_pretrained()
- torch_model.eval()
-
- # Device
- device = hub.Device("Samsung Galaxy S23")
-
- # Trace model
- input_shape = torch_model.get_input_spec()
- sample_inputs = torch_model.sample_inputs()
-
- pt_model = torch.jit.trace(torch_model, [torch.tensor(data[0]) for _, data in sample_inputs.items()])
-
- # Compile model on a specific device
- compile_job = hub.submit_compile_job(
-     model=pt_model,
-     device=device,
-     input_specs=torch_model.get_input_spec(),
- )
-
- # Get target model to run on-device
- target_model = compile_job.get_target_model()
-
  ```
+ Profile Job summary of Yolo-v7-Quantized
+ --------------------------------------------------
+ Device: RB5 (Proxy) (12)
+ Estimated Inference Time: 93.32 ms
+ Estimated Peak Memory Range: 8.36-44.24 MB
+ Compute Units: NPU (32),GPU (126),CPU (68) | Total (226)
+

- Step 2: **Performance profiling on cloud-hosted device**
-
- After compiling the model in step 1, it can be profiled on-device using the
- `target_model`. Note that this script runs the model on a device automatically
- provisioned in the cloud. Once the job is submitted, you can navigate to a
- provided job URL to view a variety of on-device performance metrics.
- ```python
- profile_job = hub.submit_profile_job(
-     model=target_model,
-     device=device,
- )
-
  ```

- Step 3: **Verify on-device accuracy**
-
- To verify the accuracy of the model on-device, you can run on-device inference
- on sample input data on the same cloud-hosted device.
- ```python
- input_data = torch_model.sample_inputs()
- inference_job = hub.submit_inference_job(
-     model=target_model,
-     device=device,
-     inputs=input_data,
- )
-
- on_device_output = inference_job.download_output_data()
-
- ```
- With the output of the model, you can compute metrics like PSNR and relative
- error, or spot-check the output against the expected output.
-
- **Note**: This on-device profiling and inference requires access to Qualcomm®
- AI Hub. [Sign up for access](https://myaccount.qualcomm.com/signup).


  ## Run demo on a cloud-hosted device
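The removed Step 3 text suggests checking on-device accuracy via PSNR or relative error but stops short of showing the comparison. Below is a minimal, self-contained sketch of such a check; the array names and the YOLOv7-style output shape are illustrative assumptions, and in the real flow `expected` and `on_device` would come from the PyTorch model and `inference_job.download_output_data()` respectively.

```python
import numpy as np

def psnr(reference: np.ndarray, test: np.ndarray, peak: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB; higher means the outputs agree more closely."""
    mse = np.mean((reference.astype(np.float64) - test.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak**2 / mse)

# Illustrative stand-ins (hypothetical YOLOv7 head shape: 25200 boxes x 85 values).
rng = np.random.default_rng(0)
expected = rng.random((1, 25200, 85), dtype=np.float32)   # reference (PyTorch) output
on_device = expected + rng.normal(0.0, 1e-3, expected.shape).astype(np.float32)

print(f"PSNR: {psnr(expected, on_device):.2f} dB")
rel_err = np.abs(expected - on_device) / (np.abs(expected) + 1e-9)
print(f"Max relative error: {rel_err.max():.2e}")
```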
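The newly added profile summary can also be made concrete: its compute-unit counts translate into a layer-placement breakdown, as in the small sketch below. The counts come straight from the summary above; reading the low NPU share as the reason for the 93.32 ms estimate on the RB5 proxy (versus 4.596 ms on Snapdragon® 8 Gen 2) is an inference from the numbers, not a claim the README makes.

```python
# Compute-unit counts from the "Profile Job summary" block above (RB5 proxy device).
units = {"NPU": 32, "GPU": 126, "CPU": 68}
total = sum(units.values())
assert total == 226  # matches "Total (226)" in the summary

for name, count in units.items():
    print(f"{name}: {count}/{total} ops ({100 * count / total:.1f}%)")
# NPU: 32/226 ops (14.2%)
# GPU: 126/226 ops (55.8%)
# CPU: 68/226 ops (30.1%)
```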
@@ -207,7 +145,7 @@ Explore all available models on [Qualcomm® AI Hub](https://aihub.qualcomm.com/)
  ## License
  - The license for the original implementation of Yolo-v7-Quantized can be found
    [here](https://github.com/WongKinYiu/yolov7/blob/main/LICENSE.md).
- - The license for the compiled assets for on-device deployment can be found [here]({deploy_license_url})
+ - The license for the compiled assets for on-device deployment can be found [here](https://github.com/WongKinYiu/yolov7/blob/main/LICENSE.md)

  ## References
  * [YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors](https://arxiv.org/abs/2207.02696)
 