add experience

7d060af over 2 years ago

24 kB

	---
	pipeline_tag: image-classification
	datasets:
	- beans
	metrics:
	- accuracy
	tags:
	- vit
	---

	task: `image-classification`
	Backend: `sagemaker-training`
	Backend args: `{'instance_type': 'ml.g4dn.2xlarge', 'supported_instructions': None}`
	Number of evaluation samples: `All dataset`

	Fixed parameters:
	* model_name_or_path: `nateraw/vit-base-beans`
	* dataset:
	* path: `beans`
	* eval_split: `validation`
	* data_keys: `{'primary': 'image'}`
	* ref_keys: `['labels']`
	* calibration_split: `train`
	* quantization_approach: `dynamic`
	* calibration:
	* method: `minmax`
	* num_calibration_samples: `100`
	* framework: `onnxruntime`
	* framework_args:
	* opset: `11`
	* optimization_level: `1`
	* aware_training: `False`

	Benchmarked parameters:
	* operators_to_quantize: `['Add']`, `['Add', 'MatMul']`
	* node_exclusion: `[]`, `['layernorm', 'gelu', 'residual', 'gather', 'softmax']`
	* per_channel: `False`, `True`

	# Evaluation
	## Non-time metrics
	\| operators_to_quantize \| node_exclusion \| per_channel \| \| accuracy (original) \| accuracy (optimized) \|
	\| :-------------------: \| :------------------------------------------------------: \| :---------: \| :-: \| :-----------------: \| :------------------: \|
	\| `['Add', 'MatMul']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `False` \| \\| \| 0.980 \| 0.980 \|
	\| `['Add', 'MatMul']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `True` \| \\| \| 0.980 \| 0.980 \|
	\| `['Add', 'MatMul']` \| `[]` \| `False` \| \\| \| 0.980 \| 0.980 \|
	\| `['Add', 'MatMul']` \| `[]` \| `True` \| \\| \| 0.980 \| 0.980 \|
	\| `['Add']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `False` \| \\| \| 0.980 \| 0.980 \|
	\| `['Add']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `True` \| \\| \| 0.980 \| 0.980 \|
	\| `['Add']` \| `[]` \| `False` \| \\| \| 0.980 \| 0.980 \|
	\| `['Add']` \| `[]` \| `True` \| \\| \| 0.980 \| 0.980 \|

	## Time metrics
	Time benchmarks were run for 15 seconds per config.


	Below, time metrics for batch size = 1, input length = 32.

	\| operators_to_quantize \| node_exclusion \| per_channel \| \| latency_mean (original, ms) \| latency_mean (optimized, ms) \| \| throughput (original, /s) \| throughput (optimized, /s) \|
	\| :-------------------: \| :------------------------------------------------------: \| :---------: \| :-: \| :-------------------------: \| :--------------------------: \| :-: \| :-----------------------: \| :------------------------: \|
	\| `['Add', 'MatMul']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `False` \| \\| \| 200.50 \| 63.00 \| \\| \| 5.00 \| 15.93 \|
	\| `['Add', 'MatMul']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `True` \| \\| \| 198.19 \| 72.65 \| \\| \| 5.07 \| 13.80 \|
	\| `['Add', 'MatMul']` \| `[]` \| `False` \| \\| \| 191.44 \| 63.27 \| \\| \| 5.27 \| 15.87 \|
	\| `['Add', 'MatMul']` \| `[]` \| `True` \| \\| \| 154.84 \| 72.51 \| \\| \| 6.47 \| 13.80 \|
	\| `['Add']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `False` \| \\| \| 155.84 \| 130.95 \| \\| \| 6.47 \| 7.67 \|
	\| `['Add']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `True` \| \\| \| 201.76 \| 131.25 \| \\| \| 5.00 \| 7.67 \|
	\| `['Add']` \| `[]` \| `False` \| \\| \| 198.96 \| 128.82 \| \\| \| 5.07 \| 7.80 \|
	\| `['Add']` \| `[]` \| `True` \| \\| \| 163.76 \| 129.62 \| \\| \| 6.13 \| 7.73 \|


	Below, time metrics for batch size = 1, input length = 64.

	\| operators_to_quantize \| node_exclusion \| per_channel \| \| latency_mean (original, ms) \| latency_mean (optimized, ms) \| \| throughput (original, /s) \| throughput (optimized, /s) \|
	\| :-------------------: \| :------------------------------------------------------: \| :---------: \| :-: \| :-------------------------: \| :--------------------------: \| :-: \| :-----------------------: \| :------------------------: \|
	\| `['Add', 'MatMul']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `False` \| \\| \| 162.75 \| 67.18 \| \\| \| 6.20 \| 14.93 \|
	\| `['Add', 'MatMul']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `True` \| \\| \| 159.69 \| 72.77 \| \\| \| 6.33 \| 13.80 \|
	\| `['Add', 'MatMul']` \| `[]` \| `False` \| \\| \| 183.10 \| 64.02 \| \\| \| 5.47 \| 15.67 \|
	\| `['Add', 'MatMul']` \| `[]` \| `True` \| \\| \| 157.21 \| 64.16 \| \\| \| 6.40 \| 15.60 \|
	\| `['Add']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `False` \| \\| \| 155.32 \| 130.74 \| \\| \| 6.47 \| 7.67 \|
	\| `['Add']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `True` \| \\| \| 198.56 \| 162.51 \| \\| \| 5.07 \| 6.20 \|
	\| `['Add']` \| `[]` \| `False` \| \\| \| 186.58 \| 163.38 \| \\| \| 5.40 \| 6.13 \|
	\| `['Add']` \| `[]` \| `True` \| \\| \| 199.75 \| 131.46 \| \\| \| 5.07 \| 7.67 \|


	Below, time metrics for batch size = 1, input length = 128.

	\| operators_to_quantize \| node_exclusion \| per_channel \| \| latency_mean (original, ms) \| latency_mean (optimized, ms) \| \| throughput (original, /s) \| throughput (optimized, /s) \|
	\| :-------------------: \| :------------------------------------------------------: \| :---------: \| :-: \| :-------------------------: \| :--------------------------: \| :-: \| :-----------------------: \| :------------------------: \|
	\| `['Add', 'MatMul']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `False` \| \\| \| 160.58 \| 67.65 \| \\| \| 6.27 \| 14.80 \|
	\| `['Add', 'MatMul']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `True` \| \\| \| 158.60 \| 72.53 \| \\| \| 6.33 \| 13.80 \|
	\| `['Add', 'MatMul']` \| `[]` \| `False` \| \\| \| 200.46 \| 62.95 \| \\| \| 5.00 \| 15.93 \|
	\| `['Add', 'MatMul']` \| `[]` \| `True` \| \\| \| 195.39 \| 72.28 \| \\| \| 5.13 \| 13.87 \|
	\| `['Add']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `False` \| \\| \| 197.59 \| 128.80 \| \\| \| 5.07 \| 7.80 \|
	\| `['Add']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `True` \| \\| \| 156.24 \| 162.63 \| \\| \| 6.47 \| 6.20 \|
	\| `['Add']` \| `[]` \| `False` \| \\| \| 157.25 \| 129.13 \| \\| \| 6.40 \| 7.80 \|
	\| `['Add']` \| `[]` \| `True` \| \\| \| 176.08 \| 161.79 \| \\| \| 5.73 \| 6.20 \|


	Below, time metrics for batch size = 4, input length = 32.

	\| operators_to_quantize \| node_exclusion \| per_channel \| \| latency_mean (original, ms) \| latency_mean (optimized, ms) \| \| throughput (original, /s) \| throughput (optimized, /s) \|
	\| :-------------------: \| :------------------------------------------------------: \| :---------: \| :-: \| :-------------------------: \| :--------------------------: \| :-: \| :-----------------------: \| :------------------------: \|
	\| `['Add', 'MatMul']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `False` \| \\| \| 503.83 \| 219.62 \| \\| \| 2.00 \| 4.60 \|
	\| `['Add', 'MatMul']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `True` \| \\| \| 603.26 \| 266.15 \| \\| \| 1.67 \| 3.80 \|
	\| `['Add', 'MatMul']` \| `[]` \| `False` \| \\| \| 654.79 \| 217.45 \| \\| \| 1.53 \| 4.60 \|
	\| `['Add', 'MatMul']` \| `[]` \| `True` \| \\| \| 654.33 \| 219.54 \| \\| \| 1.53 \| 4.60 \|
	\| `['Add']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `False` \| \\| \| 654.20 \| 481.61 \| \\| \| 1.53 \| 2.13 \|
	\| `['Add']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `True` \| \\| \| 609.81 \| 632.73 \| \\| \| 1.67 \| 1.60 \|
	\| `['Add']` \| `[]` \| `False` \| \\| \| 588.86 \| 602.91 \| \\| \| 1.73 \| 1.67 \|
	\| `['Add']` \| `[]` \| `True` \| \\| \| 666.98 \| 655.32 \| \\| \| 1.53 \| 1.53 \|


	Below, time metrics for batch size = 4, input length = 64.

	\| operators_to_quantize \| node_exclusion \| per_channel \| \| latency_mean (original, ms) \| latency_mean (optimized, ms) \| \| throughput (original, /s) \| throughput (optimized, /s) \|
	\| :-------------------: \| :------------------------------------------------------: \| :---------: \| :-: \| :-------------------------: \| :--------------------------: \| :-: \| :-----------------------: \| :------------------------: \|
	\| `['Add', 'MatMul']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `False` \| \\| \| 656.87 \| 216.32 \| \\| \| 1.53 \| 4.67 \|
	\| `['Add', 'MatMul']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `True` \| \\| \| 507.24 \| 265.62 \| \\| \| 2.00 \| 3.80 \|
	\| `['Add', 'MatMul']` \| `[]` \| `False` \| \\| \| 655.36 \| 219.61 \| \\| \| 1.53 \| 4.60 \|
	\| `['Add', 'MatMul']` \| `[]` \| `True` \| \\| \| 613.28 \| 220.96 \| \\| \| 1.67 \| 4.53 \|
	\| `['Add']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `False` \| \\| \| 656.30 \| 652.72 \| \\| \| 1.53 \| 1.53 \|
	\| `['Add']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `True` \| \\| \| 521.09 \| 472.90 \| \\| \| 1.93 \| 2.13 \|
	\| `['Add']` \| `[]` \| `False` \| \\| \| 655.37 \| 473.77 \| \\| \| 1.53 \| 2.13 \|
	\| `['Add']` \| `[]` \| `True` \| \\| \| 653.62 \| 468.82 \| \\| \| 1.53 \| 2.13 \|


	Below, time metrics for batch size = 4, input length = 128.

	\| operators_to_quantize \| node_exclusion \| per_channel \| \| latency_mean (original, ms) \| latency_mean (optimized, ms) \| \| throughput (original, /s) \| throughput (optimized, /s) \|
	\| :-------------------: \| :------------------------------------------------------: \| :---------: \| :-: \| :-------------------------: \| :--------------------------: \| :-: \| :-----------------------: \| :------------------------: \|
	\| `['Add', 'MatMul']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `False` \| \\| \| 654.24 \| 216.82 \| \\| \| 1.53 \| 4.67 \|
	\| `['Add', 'MatMul']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `True` \| \\| \| 657.16 \| 240.11 \| \\| \| 1.53 \| 4.20 \|
	\| `['Add', 'MatMul']` \| `[]` \| `False` \| \\| \| 504.14 \| 217.47 \| \\| \| 2.00 \| 4.60 \|
	\| `['Add', 'MatMul']` \| `[]` \| `True` \| \\| \| 655.94 \| 220.12 \| \\| \| 1.53 \| 4.60 \|
	\| `['Add']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `False` \| \\| \| 653.99 \| 479.06 \| \\| \| 1.53 \| 2.13 \|
	\| `['Add']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `True` \| \\| \| 642.48 \| 666.28 \| \\| \| 1.60 \| 1.53 \|
	\| `['Add']` \| `[]` \| `False` \| \\| \| 656.34 \| 661.24 \| \\| \| 1.53 \| 1.53 \|
	\| `['Add']` \| `[]` \| `True` \| \\| \| 661.86 \| 472.49 \| \\| \| 1.53 \| 2.13 \|


	Below, time metrics for batch size = 8, input length = 32.

	\| operators_to_quantize \| node_exclusion \| per_channel \| \| latency_mean (original, ms) \| latency_mean (optimized, ms) \| \| throughput (original, /s) \| throughput (optimized, /s) \|
	\| :-------------------: \| :------------------------------------------------------: \| :---------: \| :-: \| :-------------------------: \| :--------------------------: \| :-: \| :-----------------------: \| :------------------------: \|
	\| `['Add', 'MatMul']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `False` \| \\| \| 1294.07 \| 472.54 \| \\| \| 0.80 \| 2.13 \|
	\| `['Add', 'MatMul']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `True` \| \\| \| 1287.58 \| 542.72 \| \\| \| 0.80 \| 1.87 \|
	\| `['Add', 'MatMul']` \| `[]` \| `False` \| \\| \| 1033.37 \| 433.32 \| \\| \| 1.00 \| 2.33 \|
	\| `['Add', 'MatMul']` \| `[]` \| `True` \| \\| \| 1030.14 \| 542.36 \| \\| \| 1.00 \| 1.87 \|
	\| `['Add']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `False` \| \\| \| 953.27 \| 926.14 \| \\| \| 1.07 \| 1.13 \|
	\| `['Add']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `True` \| \\| \| 1173.01 \| 995.22 \| \\| \| 0.87 \| 1.07 \|
	\| `['Add']` \| `[]` \| `False` \| \\| \| 1280.07 \| 926.97 \| \\| \| 0.80 \| 1.13 \|
	\| `['Add']` \| `[]` \| `True` \| \\| \| 1283.70 \| 927.87 \| \\| \| 0.80 \| 1.13 \|


	Below, time metrics for batch size = 8, input length = 64.

	\| operators_to_quantize \| node_exclusion \| per_channel \| \| latency_mean (original, ms) \| latency_mean (optimized, ms) \| \| throughput (original, /s) \| throughput (optimized, /s) \|
	\| :-------------------: \| :------------------------------------------------------: \| :---------: \| :-: \| :-------------------------: \| :--------------------------: \| :-: \| :-----------------------: \| :------------------------: \|
	\| `['Add', 'MatMul']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `False` \| \\| \| 1273.61 \| 435.27 \| \\| \| 0.80 \| 2.33 \|
	\| `['Add', 'MatMul']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `True` \| \\| \| 1157.00 \| 542.75 \| \\| \| 0.87 \| 1.87 \|
	\| `['Add', 'MatMul']` \| `[]` \| `False` \| \\| \| 968.85 \| 537.65 \| \\| \| 1.07 \| 1.87 \|
	\| `['Add', 'MatMul']` \| `[]` \| `True` \| \\| \| 1107.66 \| 472.53 \| \\| \| 0.93 \| 2.13 \|
	\| `['Add']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `False` \| \\| \| 1270.30 \| 1092.10 \| \\| \| 0.80 \| 0.93 \|
	\| `['Add']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `True` \| \\| \| 1263.29 \| 1012.66 \| \\| \| 0.80 \| 1.00 \|
	\| `['Add']` \| `[]` \| `False` \| \\| \| 1007.19 \| 1331.12 \| \\| \| 1.07 \| 0.80 \|
	\| `['Add']` \| `[]` \| `True` \| \\| \| 1286.51 \| 1317.96 \| \\| \| 0.80 \| 0.80 \|


	Below, time metrics for batch size = 8, input length = 128.

	\| operators_to_quantize \| node_exclusion \| per_channel \| \| latency_mean (original, ms) \| latency_mean (optimized, ms) \| \| throughput (original, /s) \| throughput (optimized, /s) \|
	\| :-------------------: \| :------------------------------------------------------: \| :---------: \| :-: \| :-------------------------: \| :--------------------------: \| :-: \| :-----------------------: \| :------------------------: \|
	\| `['Add', 'MatMul']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `False` \| \\| \| 1188.98 \| 537.58 \| \\| \| 0.87 \| 1.87 \|
	\| `['Add', 'MatMul']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `True` \| \\| \| 951.31 \| 489.40 \| \\| \| 1.07 \| 2.07 \|
	\| `['Add', 'MatMul']` \| `[]` \| `False` \| \\| \| 1278.73 \| 537.52 \| \\| \| 0.80 \| 1.87 \|
	\| `['Add', 'MatMul']` \| `[]` \| `True` \| \\| \| 1005.38 \| 440.01 \| \\| \| 1.07 \| 2.33 \|
	\| `['Add']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `False` \| \\| \| 1265.55 \| 1304.51 \| \\| \| 0.80 \| 0.80 \|
	\| `['Add']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `True` \| \\| \| 1186.54 \| 934.09 \| \\| \| 0.87 \| 1.13 \|
	\| `['Add']` \| `[]` \| `False` \| \\| \| 1276.38 \| 1319.84 \| \\| \| 0.80 \| 0.80 \|
	\| `['Add']` \| `[]` \| `True` \| \\| \| 981.81 \| 940.69 \| \\| \| 1.07 \| 1.07 \|

	---
	pipeline_tag: image-classification
	datasets:
	- beans
	metrics:
	- accuracy
	tags:
	- vit
	---

	task: `image-classification`
	Backend: `sagemaker-training`
	Backend args: `{'instance_type': 'ml.g4dn.2xlarge', 'supported_instructions': None}`
	Number of evaluation samples: `All dataset`

	Fixed parameters:
	* model_name_or_path: `nateraw/vit-base-beans`
	* dataset:
	* path: `beans`
	* eval_split: `validation`
	* data_keys: `{'primary': 'image'}`
	* ref_keys: `['labels']`
	* calibration_split: `train`
	* quantization_approach: `dynamic`
	* calibration:
	* method: `minmax`
	* num_calibration_samples: `100`
	* framework: `onnxruntime`
	* framework_args:
	* opset: `11`
	* optimization_level: `1`
	* aware_training: `False`

	Benchmarked parameters:
	* operators_to_quantize: `['Add']`, `['Add', 'MatMul']`
	* node_exclusion: `[]`, `['layernorm', 'gelu', 'residual', 'gather', 'softmax']`
	* per_channel: `False`, `True`

	# Evaluation
	## Non-time metrics
	\| operators_to_quantize \| node_exclusion \| per_channel \| \| accuracy (original) \| accuracy (optimized) \|
	\| :-------------------: \| :------------------------------------------------------: \| :---------: \| :-: \| :-----------------: \| :------------------: \|
	\| `['Add', 'MatMul']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `False` \| \\| \| 0.980 \| 0.980 \|
	\| `['Add', 'MatMul']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `True` \| \\| \| 0.980 \| 0.980 \|
	\| `['Add', 'MatMul']` \| `[]` \| `False` \| \\| \| 0.980 \| 0.980 \|
	\| `['Add', 'MatMul']` \| `[]` \| `True` \| \\| \| 0.980 \| 0.980 \|
	\| `['Add']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `False` \| \\| \| 0.980 \| 0.980 \|
	\| `['Add']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `True` \| \\| \| 0.980 \| 0.980 \|
	\| `['Add']` \| `[]` \| `False` \| \\| \| 0.980 \| 0.980 \|
	\| `['Add']` \| `[]` \| `True` \| \\| \| 0.980 \| 0.980 \|

	## Time metrics
	Time benchmarks were run for 15 seconds per config.


	Below, time metrics for batch size = 1, input length = 32.

	\| operators_to_quantize \| node_exclusion \| per_channel \| \| latency_mean (original, ms) \| latency_mean (optimized, ms) \| \| throughput (original, /s) \| throughput (optimized, /s) \|
	\| :-------------------: \| :------------------------------------------------------: \| :---------: \| :-: \| :-------------------------: \| :--------------------------: \| :-: \| :-----------------------: \| :------------------------: \|
	\| `['Add', 'MatMul']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `False` \| \\| \| 200.50 \| 63.00 \| \\| \| 5.00 \| 15.93 \|
	\| `['Add', 'MatMul']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `True` \| \\| \| 198.19 \| 72.65 \| \\| \| 5.07 \| 13.80 \|
	\| `['Add', 'MatMul']` \| `[]` \| `False` \| \\| \| 191.44 \| 63.27 \| \\| \| 5.27 \| 15.87 \|
	\| `['Add', 'MatMul']` \| `[]` \| `True` \| \\| \| 154.84 \| 72.51 \| \\| \| 6.47 \| 13.80 \|
	\| `['Add']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `False` \| \\| \| 155.84 \| 130.95 \| \\| \| 6.47 \| 7.67 \|
	\| `['Add']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `True` \| \\| \| 201.76 \| 131.25 \| \\| \| 5.00 \| 7.67 \|
	\| `['Add']` \| `[]` \| `False` \| \\| \| 198.96 \| 128.82 \| \\| \| 5.07 \| 7.80 \|
	\| `['Add']` \| `[]` \| `True` \| \\| \| 163.76 \| 129.62 \| \\| \| 6.13 \| 7.73 \|


	Below, time metrics for batch size = 1, input length = 64.

	\| operators_to_quantize \| node_exclusion \| per_channel \| \| latency_mean (original, ms) \| latency_mean (optimized, ms) \| \| throughput (original, /s) \| throughput (optimized, /s) \|
	\| :-------------------: \| :------------------------------------------------------: \| :---------: \| :-: \| :-------------------------: \| :--------------------------: \| :-: \| :-----------------------: \| :------------------------: \|
	\| `['Add', 'MatMul']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `False` \| \\| \| 162.75 \| 67.18 \| \\| \| 6.20 \| 14.93 \|
	\| `['Add', 'MatMul']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `True` \| \\| \| 159.69 \| 72.77 \| \\| \| 6.33 \| 13.80 \|
	\| `['Add', 'MatMul']` \| `[]` \| `False` \| \\| \| 183.10 \| 64.02 \| \\| \| 5.47 \| 15.67 \|
	\| `['Add', 'MatMul']` \| `[]` \| `True` \| \\| \| 157.21 \| 64.16 \| \\| \| 6.40 \| 15.60 \|
	\| `['Add']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `False` \| \\| \| 155.32 \| 130.74 \| \\| \| 6.47 \| 7.67 \|
	\| `['Add']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `True` \| \\| \| 198.56 \| 162.51 \| \\| \| 5.07 \| 6.20 \|
	\| `['Add']` \| `[]` \| `False` \| \\| \| 186.58 \| 163.38 \| \\| \| 5.40 \| 6.13 \|
	\| `['Add']` \| `[]` \| `True` \| \\| \| 199.75 \| 131.46 \| \\| \| 5.07 \| 7.67 \|


	Below, time metrics for batch size = 1, input length = 128.

	\| operators_to_quantize \| node_exclusion \| per_channel \| \| latency_mean (original, ms) \| latency_mean (optimized, ms) \| \| throughput (original, /s) \| throughput (optimized, /s) \|
	\| :-------------------: \| :------------------------------------------------------: \| :---------: \| :-: \| :-------------------------: \| :--------------------------: \| :-: \| :-----------------------: \| :------------------------: \|
	\| `['Add', 'MatMul']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `False` \| \\| \| 160.58 \| 67.65 \| \\| \| 6.27 \| 14.80 \|
	\| `['Add', 'MatMul']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `True` \| \\| \| 158.60 \| 72.53 \| \\| \| 6.33 \| 13.80 \|
	\| `['Add', 'MatMul']` \| `[]` \| `False` \| \\| \| 200.46 \| 62.95 \| \\| \| 5.00 \| 15.93 \|
	\| `['Add', 'MatMul']` \| `[]` \| `True` \| \\| \| 195.39 \| 72.28 \| \\| \| 5.13 \| 13.87 \|
	\| `['Add']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `False` \| \\| \| 197.59 \| 128.80 \| \\| \| 5.07 \| 7.80 \|
	\| `['Add']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `True` \| \\| \| 156.24 \| 162.63 \| \\| \| 6.47 \| 6.20 \|
	\| `['Add']` \| `[]` \| `False` \| \\| \| 157.25 \| 129.13 \| \\| \| 6.40 \| 7.80 \|
	\| `['Add']` \| `[]` \| `True` \| \\| \| 176.08 \| 161.79 \| \\| \| 5.73 \| 6.20 \|


	Below, time metrics for batch size = 4, input length = 32.

	\| operators_to_quantize \| node_exclusion \| per_channel \| \| latency_mean (original, ms) \| latency_mean (optimized, ms) \| \| throughput (original, /s) \| throughput (optimized, /s) \|
	\| :-------------------: \| :------------------------------------------------------: \| :---------: \| :-: \| :-------------------------: \| :--------------------------: \| :-: \| :-----------------------: \| :------------------------: \|
	\| `['Add', 'MatMul']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `False` \| \\| \| 503.83 \| 219.62 \| \\| \| 2.00 \| 4.60 \|
	\| `['Add', 'MatMul']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `True` \| \\| \| 603.26 \| 266.15 \| \\| \| 1.67 \| 3.80 \|
	\| `['Add', 'MatMul']` \| `[]` \| `False` \| \\| \| 654.79 \| 217.45 \| \\| \| 1.53 \| 4.60 \|
	\| `['Add', 'MatMul']` \| `[]` \| `True` \| \\| \| 654.33 \| 219.54 \| \\| \| 1.53 \| 4.60 \|
	\| `['Add']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `False` \| \\| \| 654.20 \| 481.61 \| \\| \| 1.53 \| 2.13 \|
	\| `['Add']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `True` \| \\| \| 609.81 \| 632.73 \| \\| \| 1.67 \| 1.60 \|
	\| `['Add']` \| `[]` \| `False` \| \\| \| 588.86 \| 602.91 \| \\| \| 1.73 \| 1.67 \|
	\| `['Add']` \| `[]` \| `True` \| \\| \| 666.98 \| 655.32 \| \\| \| 1.53 \| 1.53 \|


	Below, time metrics for batch size = 4, input length = 64.

	\| operators_to_quantize \| node_exclusion \| per_channel \| \| latency_mean (original, ms) \| latency_mean (optimized, ms) \| \| throughput (original, /s) \| throughput (optimized, /s) \|
	\| :-------------------: \| :------------------------------------------------------: \| :---------: \| :-: \| :-------------------------: \| :--------------------------: \| :-: \| :-----------------------: \| :------------------------: \|
	\| `['Add', 'MatMul']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `False` \| \\| \| 656.87 \| 216.32 \| \\| \| 1.53 \| 4.67 \|
	\| `['Add', 'MatMul']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `True` \| \\| \| 507.24 \| 265.62 \| \\| \| 2.00 \| 3.80 \|
	\| `['Add', 'MatMul']` \| `[]` \| `False` \| \\| \| 655.36 \| 219.61 \| \\| \| 1.53 \| 4.60 \|
	\| `['Add', 'MatMul']` \| `[]` \| `True` \| \\| \| 613.28 \| 220.96 \| \\| \| 1.67 \| 4.53 \|
	\| `['Add']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `False` \| \\| \| 656.30 \| 652.72 \| \\| \| 1.53 \| 1.53 \|
	\| `['Add']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `True` \| \\| \| 521.09 \| 472.90 \| \\| \| 1.93 \| 2.13 \|
	\| `['Add']` \| `[]` \| `False` \| \\| \| 655.37 \| 473.77 \| \\| \| 1.53 \| 2.13 \|
	\| `['Add']` \| `[]` \| `True` \| \\| \| 653.62 \| 468.82 \| \\| \| 1.53 \| 2.13 \|


	Below, time metrics for batch size = 4, input length = 128.

	\| operators_to_quantize \| node_exclusion \| per_channel \| \| latency_mean (original, ms) \| latency_mean (optimized, ms) \| \| throughput (original, /s) \| throughput (optimized, /s) \|
	\| :-------------------: \| :------------------------------------------------------: \| :---------: \| :-: \| :-------------------------: \| :--------------------------: \| :-: \| :-----------------------: \| :------------------------: \|
	\| `['Add', 'MatMul']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `False` \| \\| \| 654.24 \| 216.82 \| \\| \| 1.53 \| 4.67 \|
	\| `['Add', 'MatMul']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `True` \| \\| \| 657.16 \| 240.11 \| \\| \| 1.53 \| 4.20 \|
	\| `['Add', 'MatMul']` \| `[]` \| `False` \| \\| \| 504.14 \| 217.47 \| \\| \| 2.00 \| 4.60 \|
	\| `['Add', 'MatMul']` \| `[]` \| `True` \| \\| \| 655.94 \| 220.12 \| \\| \| 1.53 \| 4.60 \|
	\| `['Add']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `False` \| \\| \| 653.99 \| 479.06 \| \\| \| 1.53 \| 2.13 \|
	\| `['Add']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `True` \| \\| \| 642.48 \| 666.28 \| \\| \| 1.60 \| 1.53 \|
	\| `['Add']` \| `[]` \| `False` \| \\| \| 656.34 \| 661.24 \| \\| \| 1.53 \| 1.53 \|
	\| `['Add']` \| `[]` \| `True` \| \\| \| 661.86 \| 472.49 \| \\| \| 1.53 \| 2.13 \|


	Below, time metrics for batch size = 8, input length = 32.

	\| operators_to_quantize \| node_exclusion \| per_channel \| \| latency_mean (original, ms) \| latency_mean (optimized, ms) \| \| throughput (original, /s) \| throughput (optimized, /s) \|
	\| :-------------------: \| :------------------------------------------------------: \| :---------: \| :-: \| :-------------------------: \| :--------------------------: \| :-: \| :-----------------------: \| :------------------------: \|
	\| `['Add', 'MatMul']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `False` \| \\| \| 1294.07 \| 472.54 \| \\| \| 0.80 \| 2.13 \|
	\| `['Add', 'MatMul']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `True` \| \\| \| 1287.58 \| 542.72 \| \\| \| 0.80 \| 1.87 \|
	\| `['Add', 'MatMul']` \| `[]` \| `False` \| \\| \| 1033.37 \| 433.32 \| \\| \| 1.00 \| 2.33 \|
	\| `['Add', 'MatMul']` \| `[]` \| `True` \| \\| \| 1030.14 \| 542.36 \| \\| \| 1.00 \| 1.87 \|
	\| `['Add']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `False` \| \\| \| 953.27 \| 926.14 \| \\| \| 1.07 \| 1.13 \|
	\| `['Add']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `True` \| \\| \| 1173.01 \| 995.22 \| \\| \| 0.87 \| 1.07 \|
	\| `['Add']` \| `[]` \| `False` \| \\| \| 1280.07 \| 926.97 \| \\| \| 0.80 \| 1.13 \|
	\| `['Add']` \| `[]` \| `True` \| \\| \| 1283.70 \| 927.87 \| \\| \| 0.80 \| 1.13 \|


	Below, time metrics for batch size = 8, input length = 64.

	\| operators_to_quantize \| node_exclusion \| per_channel \| \| latency_mean (original, ms) \| latency_mean (optimized, ms) \| \| throughput (original, /s) \| throughput (optimized, /s) \|
	\| :-------------------: \| :------------------------------------------------------: \| :---------: \| :-: \| :-------------------------: \| :--------------------------: \| :-: \| :-----------------------: \| :------------------------: \|
	\| `['Add', 'MatMul']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `False` \| \\| \| 1273.61 \| 435.27 \| \\| \| 0.80 \| 2.33 \|
	\| `['Add', 'MatMul']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `True` \| \\| \| 1157.00 \| 542.75 \| \\| \| 0.87 \| 1.87 \|
	\| `['Add', 'MatMul']` \| `[]` \| `False` \| \\| \| 968.85 \| 537.65 \| \\| \| 1.07 \| 1.87 \|
	\| `['Add', 'MatMul']` \| `[]` \| `True` \| \\| \| 1107.66 \| 472.53 \| \\| \| 0.93 \| 2.13 \|
	\| `['Add']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `False` \| \\| \| 1270.30 \| 1092.10 \| \\| \| 0.80 \| 0.93 \|
	\| `['Add']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `True` \| \\| \| 1263.29 \| 1012.66 \| \\| \| 0.80 \| 1.00 \|
	\| `['Add']` \| `[]` \| `False` \| \\| \| 1007.19 \| 1331.12 \| \\| \| 1.07 \| 0.80 \|
	\| `['Add']` \| `[]` \| `True` \| \\| \| 1286.51 \| 1317.96 \| \\| \| 0.80 \| 0.80 \|


	Below, time metrics for batch size = 8, input length = 128.

	\| operators_to_quantize \| node_exclusion \| per_channel \| \| latency_mean (original, ms) \| latency_mean (optimized, ms) \| \| throughput (original, /s) \| throughput (optimized, /s) \|
	\| :-------------------: \| :------------------------------------------------------: \| :---------: \| :-: \| :-------------------------: \| :--------------------------: \| :-: \| :-----------------------: \| :------------------------: \|
	\| `['Add', 'MatMul']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `False` \| \\| \| 1188.98 \| 537.58 \| \\| \| 0.87 \| 1.87 \|
	\| `['Add', 'MatMul']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `True` \| \\| \| 951.31 \| 489.40 \| \\| \| 1.07 \| 2.07 \|
	\| `['Add', 'MatMul']` \| `[]` \| `False` \| \\| \| 1278.73 \| 537.52 \| \\| \| 0.80 \| 1.87 \|
	\| `['Add', 'MatMul']` \| `[]` \| `True` \| \\| \| 1005.38 \| 440.01 \| \\| \| 1.07 \| 2.33 \|
	\| `['Add']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `False` \| \\| \| 1265.55 \| 1304.51 \| \\| \| 0.80 \| 0.80 \|
	\| `['Add']` \| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` \| `True` \| \\| \| 1186.54 \| 934.09 \| \\| \| 0.87 \| 1.13 \|
	\| `['Add']` \| `[]` \| `False` \| \\| \| 1276.38 \| 1319.84 \| \\| \| 0.80 \| 0.80 \|
	\| `['Add']` \| `[]` \| `True` \| \\| \| 981.81 \| 940.69 \| \\| \| 1.07 \| 1.07 \|