fxmarty HF staff committed on
Commit
7d060af
1 Parent(s): 99a93e1

add experience

README.md ADDED
@@ -0,0 +1,180 @@
+ ---
+ pipeline_tag: image-classification
+ datasets:
+ - beans
+ metrics:
+ - accuracy
+ tags:
+ - vit
+ ---
+
+ **task**: `image-classification`
+ **Backend:** `sagemaker-training`
+ **Backend args:** `{'instance_type': 'ml.g4dn.2xlarge', 'supported_instructions': None}`
+ **Number of evaluation samples:** `All dataset`
+
+ Fixed parameters:
+ * **model_name_or_path**: `nateraw/vit-base-beans`
+ * **dataset**:
+     * **path**: `beans`
+     * **eval_split**: `validation`
+     * **data_keys**: `{'primary': 'image'}`
+     * **ref_keys**: `['labels']`
+     * **calibration_split**: `train`
+ * **quantization_approach**: `dynamic`
+ * **calibration**:
+     * **method**: `minmax`
+     * **num_calibration_samples**: `100`
+ * **framework**: `onnxruntime`
+ * **framework_args**:
+     * **opset**: `11`
+     * **optimization_level**: `1`
+ * **aware_training**: `False`
+
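To make the `dynamic` quantization approach with a min/max range more concrete, here is a minimal pure-Python sketch of 8-bit quantization in which the scale and zero point are derived from the tensor's own values at run time. It is a simplified illustration only: the function names and the sample values are hypothetical, and ONNX Runtime's actual kernels differ in detail.

```python
def quantize_dynamic_uint8(values):
    """Quantize floats to the 0..255 range using the min/max observed in `values`."""
    # Extend the range to include 0.0 so that zero maps to an exact integer.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    # Fall back to a scale of 1.0 for an all-zero input to avoid dividing by zero.
    scale = (hi - lo) / 255.0 or 1.0
    zero_point = round(-lo / scale)
    quantized = [min(255, max(0, round(v / scale) + zero_point)) for v in values]
    return quantized, scale, zero_point


def dequantize(quantized, scale, zero_point):
    """Map 8-bit integers back to approximate float values."""
    return [(q - zero_point) * scale for q in quantized]


# Made-up activation values, for illustration only.
activations = [-1.0, 0.0, 0.5, 2.0]
q, scale, zp = quantize_dynamic_uint8(activations)
restored = dequantize(q, scale, zp)
print(q, scale, zp)
print(restored)
```

Because the range is computed from the data seen at inference time, no calibration pass is needed for this step; the price is a rounding error of at most about one quantization step per value.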
+ Benchmarked parameters:
+ * **operators_to_quantize**: `['Add']`, `['Add', 'MatMul']`
+ * **node_exclusion**: `[]`, `['layernorm', 'gelu', 'residual', 'gather', 'softmax']`
+ * **per_channel**: `False`, `True`
+
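The three benchmarked parameters each take two values, and every results table in this README has eight rows, which suggests the benchmark ran the full Cartesian product. A small sketch of how such a grid can be enumerated, assuming that reading is correct (the `configs` structure is illustrative, not the benchmark tool's own format):

```python
from itertools import product

# Values copied from the "Benchmarked parameters" list.
operators_to_quantize = [["Add"], ["Add", "MatMul"]]
node_exclusion = [[], ["layernorm", "gelu", "residual", "gather", "softmax"]]
per_channel = [False, True]

# One dictionary per combination: 2 * 2 * 2 = 8 configurations.
configs = [
    {"operators_to_quantize": ops, "node_exclusion": excluded, "per_channel": pc}
    for ops, excluded, pc in product(operators_to_quantize, node_exclusion, per_channel)
]

for config in configs:
    print(config)
```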
+ # Evaluation
+ ## Non-time metrics
+ | operators_to_quantize | node_exclusion | per_channel | | accuracy (original) | accuracy (optimized) |
+ | :-------------------: | :------------------------------------------------------: | :---------: | :-: | :-----------------: | :------------------: |
+ | `['Add', 'MatMul']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | `False` | \| | 0.980 | 0.980 |
+ | `['Add', 'MatMul']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | `True` | \| | 0.980 | 0.980 |
+ | `['Add', 'MatMul']` | `[]` | `False` | \| | 0.980 | 0.980 |
+ | `['Add', 'MatMul']` | `[]` | `True` | \| | 0.980 | 0.980 |
+ | `['Add']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | `False` | \| | 0.980 | 0.980 |
+ | `['Add']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | `True` | \| | 0.980 | 0.980 |
+ | `['Add']` | `[]` | `False` | \| | 0.980 | 0.980 |
+ | `['Add']` | `[]` | `True` | \| | 0.980 | 0.980 |
+
+ ## Time metrics
+ Time benchmarks were run for 15 seconds per config.
+
+
+ Below, time metrics for batch size = 1, input length = 32.
+
+ | operators_to_quantize | node_exclusion | per_channel | | latency_mean (original, ms) | latency_mean (optimized, ms) | | throughput (original, /s) | throughput (optimized, /s) |
+ | :-------------------: | :------------------------------------------------------: | :---------: | :-: | :-------------------------: | :--------------------------: | :-: | :-----------------------: | :------------------------: |
+ | `['Add', 'MatMul']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | `False` | \| | 200.50 | 63.00 | \| | 5.00 | 15.93 |
+ | `['Add', 'MatMul']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | `True` | \| | 198.19 | 72.65 | \| | 5.07 | 13.80 |
+ | `['Add', 'MatMul']` | `[]` | `False` | \| | 191.44 | 63.27 | \| | 5.27 | 15.87 |
+ | `['Add', 'MatMul']` | `[]` | `True` | \| | 154.84 | 72.51 | \| | 6.47 | 13.80 |
+ | `['Add']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | `False` | \| | 155.84 | 130.95 | \| | 6.47 | 7.67 |
+ | `['Add']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | `True` | \| | 201.76 | 131.25 | \| | 5.00 | 7.67 |
+ | `['Add']` | `[]` | `False` | \| | 198.96 | 128.82 | \| | 5.07 | 7.80 |
+ | `['Add']` | `[]` | `True` | \| | 163.76 | 129.62 | \| | 6.13 | 7.73 |
+
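The latency columns are easier to compare as a speedup ratio (original latency divided by optimized latency). The sketch below computes that ratio for the eight rows of the batch size = 1, input length = 32 table; the row labels are shorthand introduced here for readability, and the numbers are copied from that table.

```python
# (label, latency_mean original in ms, latency_mean optimized in ms)
rows = [
    ("Add+MatMul, exclusions, per_channel=False", 200.50, 63.00),
    ("Add+MatMul, exclusions, per_channel=True", 198.19, 72.65),
    ("Add+MatMul, no exclusions, per_channel=False", 191.44, 63.27),
    ("Add+MatMul, no exclusions, per_channel=True", 154.84, 72.51),
    ("Add, exclusions, per_channel=False", 155.84, 130.95),
    ("Add, exclusions, per_channel=True", 201.76, 131.25),
    ("Add, no exclusions, per_channel=False", 198.96, 128.82),
    ("Add, no exclusions, per_channel=True", 163.76, 129.62),
]

# Speedup > 1 means the quantized model has lower mean latency.
speedups = {label: original / optimized for label, original, optimized in rows}

for label, ratio in speedups.items():
    print(f"{label}: {ratio:.2f}x")
```

The original-model latency itself ranges from 154.84 ms to 201.76 ms across rows even though the unquantized model is the same in each, so these ratios likely carry a fair amount of measurement noise and are best read as rough indications.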
+
+ Below, time metrics for batch size = 1, input length = 64.
+
+ | operators_to_quantize | node_exclusion | per_channel | | latency_mean (original, ms) | latency_mean (optimized, ms) | | throughput (original, /s) | throughput (optimized, /s) |
+ | :-------------------: | :------------------------------------------------------: | :---------: | :-: | :-------------------------: | :--------------------------: | :-: | :-----------------------: | :------------------------: |
+ | `['Add', 'MatMul']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | `False` | \| | 162.75 | 67.18 | \| | 6.20 | 14.93 |
+ | `['Add', 'MatMul']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | `True` | \| | 159.69 | 72.77 | \| | 6.33 | 13.80 |
+ | `['Add', 'MatMul']` | `[]` | `False` | \| | 183.10 | 64.02 | \| | 5.47 | 15.67 |
+ | `['Add', 'MatMul']` | `[]` | `True` | \| | 157.21 | 64.16 | \| | 6.40 | 15.60 |
+ | `['Add']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | `False` | \| | 155.32 | 130.74 | \| | 6.47 | 7.67 |
+ | `['Add']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | `True` | \| | 198.56 | 162.51 | \| | 5.07 | 6.20 |
+ | `['Add']` | `[]` | `False` | \| | 186.58 | 163.38 | \| | 5.40 | 6.13 |
+ | `['Add']` | `[]` | `True` | \| | 199.75 | 131.46 | \| | 5.07 | 7.67 |
+
+
+ Below, time metrics for batch size = 1, input length = 128.
+
+ | operators_to_quantize | node_exclusion | per_channel | | latency_mean (original, ms) | latency_mean (optimized, ms) | | throughput (original, /s) | throughput (optimized, /s) |
+ | :-------------------: | :------------------------------------------------------: | :---------: | :-: | :-------------------------: | :--------------------------: | :-: | :-----------------------: | :------------------------: |
+ | `['Add', 'MatMul']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | `False` | \| | 160.58 | 67.65 | \| | 6.27 | 14.80 |
+ | `['Add', 'MatMul']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | `True` | \| | 158.60 | 72.53 | \| | 6.33 | 13.80 |
+ | `['Add', 'MatMul']` | `[]` | `False` | \| | 200.46 | 62.95 | \| | 5.00 | 15.93 |
+ | `['Add', 'MatMul']` | `[]` | `True` | \| | 195.39 | 72.28 | \| | 5.13 | 13.87 |
+ | `['Add']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | `False` | \| | 197.59 | 128.80 | \| | 5.07 | 7.80 |
+ | `['Add']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | `True` | \| | 156.24 | 162.63 | \| | 6.47 | 6.20 |
+ | `['Add']` | `[]` | `False` | \| | 157.25 | 129.13 | \| | 6.40 | 7.80 |
+ | `['Add']` | `[]` | `True` | \| | 176.08 | 161.79 | \| | 5.73 | 6.20 |
+
+
+ Below, time metrics for batch size = 4, input length = 32.
+
+ | operators_to_quantize | node_exclusion | per_channel | | latency_mean (original, ms) | latency_mean (optimized, ms) | | throughput (original, /s) | throughput (optimized, /s) |
+ | :-------------------: | :------------------------------------------------------: | :---------: | :-: | :-------------------------: | :--------------------------: | :-: | :-----------------------: | :------------------------: |
+ | `['Add', 'MatMul']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | `False` | \| | 503.83 | 219.62 | \| | 2.00 | 4.60 |
+ | `['Add', 'MatMul']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | `True` | \| | 603.26 | 266.15 | \| | 1.67 | 3.80 |
+ | `['Add', 'MatMul']` | `[]` | `False` | \| | 654.79 | 217.45 | \| | 1.53 | 4.60 |
+ | `['Add', 'MatMul']` | `[]` | `True` | \| | 654.33 | 219.54 | \| | 1.53 | 4.60 |
+ | `['Add']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | `False` | \| | 654.20 | 481.61 | \| | 1.53 | 2.13 |
+ | `['Add']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | `True` | \| | 609.81 | 632.73 | \| | 1.67 | 1.60 |
+ | `['Add']` | `[]` | `False` | \| | 588.86 | 602.91 | \| | 1.73 | 1.67 |
+ | `['Add']` | `[]` | `True` | \| | 666.98 | 655.32 | \| | 1.53 | 1.53 |
+
+
+ Below, time metrics for batch size = 4, input length = 64.
+
+ | operators_to_quantize | node_exclusion | per_channel | | latency_mean (original, ms) | latency_mean (optimized, ms) | | throughput (original, /s) | throughput (optimized, /s) |
+ | :-------------------: | :------------------------------------------------------: | :---------: | :-: | :-------------------------: | :--------------------------: | :-: | :-----------------------: | :------------------------: |
+ | `['Add', 'MatMul']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | `False` | \| | 656.87 | 216.32 | \| | 1.53 | 4.67 |
+ | `['Add', 'MatMul']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | `True` | \| | 507.24 | 265.62 | \| | 2.00 | 3.80 |
+ | `['Add', 'MatMul']` | `[]` | `False` | \| | 655.36 | 219.61 | \| | 1.53 | 4.60 |
+ | `['Add', 'MatMul']` | `[]` | `True` | \| | 613.28 | 220.96 | \| | 1.67 | 4.53 |
+ | `['Add']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | `False` | \| | 656.30 | 652.72 | \| | 1.53 | 1.53 |
+ | `['Add']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | `True` | \| | 521.09 | 472.90 | \| | 1.93 | 2.13 |
+ | `['Add']` | `[]` | `False` | \| | 655.37 | 473.77 | \| | 1.53 | 2.13 |
+ | `['Add']` | `[]` | `True` | \| | 653.62 | 468.82 | \| | 1.53 | 2.13 |
+
+
+ Below, time metrics for batch size = 4, input length = 128.
+
+ | operators_to_quantize | node_exclusion | per_channel | | latency_mean (original, ms) | latency_mean (optimized, ms) | | throughput (original, /s) | throughput (optimized, /s) |
+ | :-------------------: | :------------------------------------------------------: | :---------: | :-: | :-------------------------: | :--------------------------: | :-: | :-----------------------: | :------------------------: |
+ | `['Add', 'MatMul']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | `False` | \| | 654.24 | 216.82 | \| | 1.53 | 4.67 |
+ | `['Add', 'MatMul']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | `True` | \| | 657.16 | 240.11 | \| | 1.53 | 4.20 |
+ | `['Add', 'MatMul']` | `[]` | `False` | \| | 504.14 | 217.47 | \| | 2.00 | 4.60 |
+ | `['Add', 'MatMul']` | `[]` | `True` | \| | 655.94 | 220.12 | \| | 1.53 | 4.60 |
+ | `['Add']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | `False` | \| | 653.99 | 479.06 | \| | 1.53 | 2.13 |
+ | `['Add']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | `True` | \| | 642.48 | 666.28 | \| | 1.60 | 1.53 |
+ | `['Add']` | `[]` | `False` | \| | 656.34 | 661.24 | \| | 1.53 | 1.53 |
+ | `['Add']` | `[]` | `True` | \| | 661.86 | 472.49 | \| | 1.53 | 2.13 |
+
+
+ Below, time metrics for batch size = 8, input length = 32.
+
+ | operators_to_quantize | node_exclusion | per_channel | | latency_mean (original, ms) | latency_mean (optimized, ms) | | throughput (original, /s) | throughput (optimized, /s) |
+ | :-------------------: | :------------------------------------------------------: | :---------: | :-: | :-------------------------: | :--------------------------: | :-: | :-----------------------: | :------------------------: |
+ | `['Add', 'MatMul']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | `False` | \| | 1294.07 | 472.54 | \| | 0.80 | 2.13 |
+ | `['Add', 'MatMul']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | `True` | \| | 1287.58 | 542.72 | \| | 0.80 | 1.87 |
+ | `['Add', 'MatMul']` | `[]` | `False` | \| | 1033.37 | 433.32 | \| | 1.00 | 2.33 |
+ | `['Add', 'MatMul']` | `[]` | `True` | \| | 1030.14 | 542.36 | \| | 1.00 | 1.87 |
+ | `['Add']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | `False` | \| | 953.27 | 926.14 | \| | 1.07 | 1.13 |
+ | `['Add']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | `True` | \| | 1173.01 | 995.22 | \| | 0.87 | 1.07 |
+ | `['Add']` | `[]` | `False` | \| | 1280.07 | 926.97 | \| | 0.80 | 1.13 |
+ | `['Add']` | `[]` | `True` | \| | 1283.70 | 927.87 | \| | 0.80 | 1.13 |
+
+
+ Below, time metrics for batch size = 8, input length = 64.
+
+ | operators_to_quantize | node_exclusion | per_channel | | latency_mean (original, ms) | latency_mean (optimized, ms) | | throughput (original, /s) | throughput (optimized, /s) |
+ | :-------------------: | :------------------------------------------------------: | :---------: | :-: | :-------------------------: | :--------------------------: | :-: | :-----------------------: | :------------------------: |
+ | `['Add', 'MatMul']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | `False` | \| | 1273.61 | 435.27 | \| | 0.80 | 2.33 |
+ | `['Add', 'MatMul']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | `True` | \| | 1157.00 | 542.75 | \| | 0.87 | 1.87 |
+ | `['Add', 'MatMul']` | `[]` | `False` | \| | 968.85 | 537.65 | \| | 1.07 | 1.87 |
+ | `['Add', 'MatMul']` | `[]` | `True` | \| | 1107.66 | 472.53 | \| | 0.93 | 2.13 |
+ | `['Add']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | `False` | \| | 1270.30 | 1092.10 | \| | 0.80 | 0.93 |
+ | `['Add']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | `True` | \| | 1263.29 | 1012.66 | \| | 0.80 | 1.00 |
+ | `['Add']` | `[]` | `False` | \| | 1007.19 | 1331.12 | \| | 1.07 | 0.80 |
+ | `['Add']` | `[]` | `True` | \| | 1286.51 | 1317.96 | \| | 0.80 | 0.80 |
+
+
+ Below, time metrics for batch size = 8, input length = 128.
+
+ | operators_to_quantize | node_exclusion | per_channel | | latency_mean (original, ms) | latency_mean (optimized, ms) | | throughput (original, /s) | throughput (optimized, /s) |
+ | :-------------------: | :------------------------------------------------------: | :---------: | :-: | :-------------------------: | :--------------------------: | :-: | :-----------------------: | :------------------------: |
+ | `['Add', 'MatMul']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | `False` | \| | 1188.98 | 537.58 | \| | 0.87 | 1.87 |
+ | `['Add', 'MatMul']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | `True` | \| | 951.31 | 489.40 | \| | 1.07 | 2.07 |
+ | `['Add', 'MatMul']` | `[]` | `False` | \| | 1278.73 | 537.52 | \| | 0.80 | 1.87 |
+ | `['Add', 'MatMul']` | `[]` | `True` | \| | 1005.38 | 440.01 | \| | 1.07 | 2.33 |
+ | `['Add']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | `False` | \| | 1265.55 | 1304.51 | \| | 0.80 | 0.80 |
+ | `['Add']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | `True` | \| | 1186.54 | 934.09 | \| | 0.87 | 1.13 |
+ | `['Add']` | `[]` | `False` | \| | 1276.38 | 1319.84 | \| | 0.80 | 0.80 |
+ | `['Add']` | `[]` | `True` | \| | 981.81 | 940.69 | \| | 1.07 | 1.07 |
+
runs.json ADDED
The diff for this file is too large to render.
tensorboard/1657641783.2675312/events.out.tfevents.1657641783.ip-10-0-93-8.ec2.internal.1.1 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c30dc121402d71aff4f3e8fe6b0a34c089a595d5cef41c922c58e7876bac26f0
+ size 815
tensorboard/1657641783.2690153/events.out.tfevents.1657641783.ip-10-0-93-8.ec2.internal.1.2 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:82d637a37ae208484298b5387020db8a62a004d7068b6649290cccaa9f5f324a
+ size 813
tensorboard/1657641783.2702107/events.out.tfevents.1657641783.ip-10-0-93-8.ec2.internal.1.3 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:4ecd3fced430e48fd4eecd98e1da3cc27e3093099aaefc462e38c7c8cd055da2
+ size 763
tensorboard/1657641783.2713995/events.out.tfevents.1657641783.ip-10-0-93-8.ec2.internal.1.4 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d401459e76332c75f2be3c899325b452dee12e403f70aea4cbe1f80d37e8d18f
+ size 762
tensorboard/1657641783.2726285/events.out.tfevents.1657641783.ip-10-0-93-8.ec2.internal.1.5 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f035eac079ac2a747ebb9a9d1e4c7c8173ceb9a35d301d2c77fa5c171d5c8a35
+ size 806
tensorboard/1657641783.2737365/events.out.tfevents.1657641783.ip-10-0-93-8.ec2.internal.1.6 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:33fc18812ed9282a99360db0a48a3c8aaaf61c9048e7e4be7bfa4099746d8a0f
+ size 804
tensorboard/1657641783.2749662/events.out.tfevents.1657641783.ip-10-0-93-8.ec2.internal.1.7 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:5f7b87620f4680251492c25b6f0e784719ee25968ccc930e5e1883eb52c4cd33
+ size 754
tensorboard/1657641783.2760458/events.out.tfevents.1657641783.ip-10-0-93-8.ec2.internal.1.8 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:af38d61be1e2c16ad75ce2baa44cada7c3a34d7ad9aac70259690b63e6ff3b88
+ size 751
tensorboard/events.out.tfevents.1657641783.ip-10-0-93-8.ec2.internal.1.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a9048ccb9e6b76b9f6843d5312bd2d004d60886e621524159ac428857f8d202a
+ size 40