fxmarty committed on
Commit
7c7d8a4
1 Parent(s): fd3005b

add experience

README.md ADDED
@@ -0,0 +1,59 @@
+ ---
+ pipeline_tag: token-classification
+ datasets:
+ - conll2003
+ metrics:
+ - precision
+ - recall
+ - f1
+ - accuracy
+ tags:
+ - distilbert
+ ---
+
+ **task**: `token-classification`
+ **Backend:** `sagemaker-training`
+ **Backend args:** `{'instance_type': 'ml.g4dn.2xlarge', 'supported_instructions': None}`
+ **Number of evaluation samples:** `All dataset`
+
+ Fixed parameters:
+ * **model_name_or_path**: `elastic/distilbert-base-uncased-finetuned-conll03-english`
+ * **dataset**:
+     * **path**: `conll2003`
+     * **eval_split**: `validation`
+     * **data_keys**: `{'primary': 'tokens'}`
+     * **ref_keys**: `['ner_tags']`
+     * **calibration_split**: `train`
+ * **quantization_approach**: `static`
+ * **operators_to_quantize**: `['Add', 'MatMul']`
+ * **per_channel**: `False`
+ * **calibration**:
+     * **method**: `minmax`
+     * **num_calibration_samples**: `100`
+ * **framework**: `onnxruntime`
+ * **framework_args**:
+     * **opset**: `11`
+     * **optimization_level**: `1`
+ * **aware_training**: `False`
+
+ Benchmarked parameters:
+ * **node_exclusion**: `[]`, `['layernorm', 'gelu', 'residual', 'gather', 'softmax']`
+
+ # Evaluation
+ ## Non-time metrics
+ | node_exclusion | | precision (original) | precision (optimized) | | recall (original) | recall (optimized) | | f1 (original) | f1 (optimized) | | accuracy (original) | accuracy (optimized) |
+ | :------------------------------------------------------: | :-: | :------------------: | :-------------------: | :-: | :---------------: | :----------------: | :-: | :-----------: | :------------: | :-: | :-----------------: | :------------------: |
+ | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | \| | 0.936 | 0.904 | \| | 0.944 | 0.921 | \| | 0.940 | 0.912 | \| | 0.988 | 0.984 |
+ | `[]` | \| | 0.936 | 0.065 | \| | 0.944 | 0.243 | \| | 0.940 | 0.103 | \| | 0.988 | 0.357 |
+
+ ## Time metrics
+ Time benchmarks were run for 15 seconds per config.
+
+
+ Below, time metrics for batch size = 4, input length = 64.
+
+ | node_exclusion | | latency_mean (original, ms) | latency_mean (optimized, ms) | | throughput (original, /s) | throughput (optimized, /s) |
+ | :------------------------------------------------------: | :-: | :-------------------------: | :--------------------------: | :-: | :-----------------------: | :------------------------: |
+ | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | \| | 120.53 | 46.41 | \| | 8.33 | 21.60 |
+ | `[]` | \| | 119.97 | 59.50 | \| | 8.40 | 16.87 |
+
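The trade-off reported in the two README tables can be made explicit with a little arithmetic. The sketch below is not part of the commit: it copies the rounded latency and F1 values from the tables, and the names `runs`, `excluded`, `all_nodes` and `summarize` are introduced here purely for illustration.

```python
# Rounded values copied from the README tables (batch size 4, input length 64).
# The dictionary keys are hypothetical labels for the two node_exclusion settings.
runs = {
    # node_exclusion = ['layernorm', 'gelu', 'residual', 'gather', 'softmax']
    "excluded": {"lat_orig": 120.53, "lat_opt": 46.41, "f1_orig": 0.940, "f1_opt": 0.912},
    # node_exclusion = []
    "all_nodes": {"lat_orig": 119.97, "lat_opt": 59.50, "f1_orig": 0.940, "f1_opt": 0.103},
}


def summarize(run):
    """Return (latency speedup, absolute F1 drop) for one configuration."""
    speedup = run["lat_orig"] / run["lat_opt"]
    f1_drop = run["f1_orig"] - run["f1_opt"]
    return round(speedup, 2), round(f1_drop, 3)


for name, run in runs.items():
    speedup, f1_drop = summarize(run)
    print(f"{name}: speedup x{speedup}, F1 drop {f1_drop}")
```

With these numbers, excluding the listed nodes gives a speedup of roughly 2.6x for an F1 drop of about 0.03, while quantizing every `Add` and `MatMul` node gives only about 2.0x and loses most of the F1 score.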
runs.json ADDED
@@ -0,0 +1,196 @@
+ [
+ {
+ "model_name_or_path": "elastic/distilbert-base-uncased-finetuned-conll03-english",
+ "task": "token-classification",
+ "task_args": null,
+ "dataset": {
+ "path": "conll2003",
+ "eval_split": "validation",
+ "data_keys": {
+ "primary": "tokens",
+ "secondary": null
+ },
+ "ref_keys": [
+ "ner_tags"
+ ],
+ "name": null,
+ "calibration_split": "train"
+ },
+ "quantization_approach": "static",
+ "operators_to_quantize": [
+ "Add",
+ "MatMul"
+ ],
+ "node_exclusion": [
+ "layernorm",
+ "gelu",
+ "residual",
+ "gather",
+ "softmax"
+ ],
+ "aware_training": false,
+ "per_channel": false,
+ "calibration": {
+ "method": "minmax",
+ "num_calibration_samples": 100,
+ "calibration_histogram_percentile": null,
+ "calibration_moving_average": null,
+ "calibration_moving_average_constant": null
+ },
+ "framework": "onnxruntime",
+ "framework_args": {
+ "opset": 11,
+ "optimization_level": 1
+ },
+ "hardware": "Architecture: x86_64\nCPU op-mode(s): 32-bit, 64-bit\nByte Order: Little Endian\nAddress sizes: 46 bits physical, 48 bits virtual\nCPU(s): 8\nOn-line CPU(s) list: 0-7\nThread(s) per core: 2\nCore(s) per socket: 4\nSocket(s): 1\nNUMA node(s): 1\nVendor ID: GenuineIntel\nCPU family: 6\nModel: 85\nModel name: Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz\nStepping: 7\nCPU MHz: 3099.967\nBogoMIPS: 4999.99\nHypervisor vendor: KVM\nVirtualization type: full\nL1d cache: 128 KiB\nL1i cache: 128 KiB\nL2 cache: 4 MiB\nL3 cache: 35.8 MiB\nNUMA node0 CPU(s): 0-7\nVulnerability Itlb multihit: KVM: Vulnerable\nVulnerability L1tf: Mitigation; PTE Inversion\nVulnerability Mds: Vulnerable: Clear CPU buffers attempted, no microcode; SMT Host state unknown\nVulnerability Meltdown: Mitigation; PTI\nVulnerability Spec store bypass: Vulnerable\nVulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization\nVulnerability Spectre v2: Mitigation; Retpolines, STIBP disabled, RSB filling\nVulnerability Srbds: Not affected\nVulnerability Tsx async abort: Not affected\nFlags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves ida arat pku ospke avx512_vnni\n",
+ "versions": {
+ "transformers": "4.20.1",
+ "optimum": "1.2.3.dev0",
+ "optimum_hash": "5c9af4e5f93c7e9bd523563230732b49603dc4d7"
+ },
+ "evaluation": {
+ "time": [
+ {
+ "batch_size": 4,
+ "input_length": 64,
+ "baseline": {
+ "nb_forwards": 125,
+ "throughput": 8.33,
+ "latency_mean": 120.533118016,
+ "latency_std": 0.8359515558530887,
+ "latency_50": 120.363805,
+ "latency_90": 121.5990676,
+ "latency_95": 121.9226278,
+ "latency_99": 123.02248340000001,
+ "latency_999": 123.981240332
+ },
+ "optimized": {
+ "nb_forwards": 324,
+ "throughput": 21.6,
+ "latency_mean": 46.41411868827161,
+ "latency_std": 1.7434868124538356,
+ "latency_50": 46.258802,
+ "latency_90": 48.954860200000006,
+ "latency_95": 49.46207725,
+ "latency_99": 50.02644861,
+ "latency_999": 50.308738387000005
+ }
+ }
+ ],
+ "others": {
+ "baseline": {
+ "precision": 0.9358012339503085,
+ "recall": 0.9444631437226523,
+ "f1": 0.9401122372057961,
+ "accuracy": 0.9882013940267124
+ },
+ "optimized": {
+ "precision": 0.9038969616908851,
+ "recall": 0.9212386401884888,
+ "f1": 0.912485414235706,
+ "accuracy": 0.9842295860753086
+ }
+ }
+ },
+ "max_eval_samples": null,
+ "time_benchmark_args": {
+ "duration": 15,
+ "warmup_runs": 5
+ },
+ "model_type": "distilbert"
+ },
+ {
+ "model_name_or_path": "elastic/distilbert-base-uncased-finetuned-conll03-english",
+ "task": "token-classification",
+ "task_args": null,
+ "dataset": {
+ "path": "conll2003",
+ "eval_split": "validation",
+ "data_keys": {
+ "primary": "tokens",
+ "secondary": null
+ },
+ "ref_keys": [
+ "ner_tags"
+ ],
+ "name": null,
+ "calibration_split": "train"
+ },
+ "quantization_approach": "static",
+ "operators_to_quantize": [
+ "Add",
+ "MatMul"
+ ],
+ "node_exclusion": [],
+ "aware_training": false,
+ "per_channel": false,
+ "calibration": {
+ "method": "minmax",
+ "num_calibration_samples": 100,
+ "calibration_histogram_percentile": null,
+ "calibration_moving_average": null,
+ "calibration_moving_average_constant": null
+ },
+ "framework": "onnxruntime",
+ "framework_args": {
+ "opset": 11,
+ "optimization_level": 1
+ },
+ "hardware": "Architecture: x86_64\nCPU op-mode(s): 32-bit, 64-bit\nByte Order: Little Endian\nAddress sizes: 46 bits physical, 48 bits virtual\nCPU(s): 8\nOn-line CPU(s) list: 0-7\nThread(s) per core: 2\nCore(s) per socket: 4\nSocket(s): 1\nNUMA node(s): 1\nVendor ID: GenuineIntel\nCPU family: 6\nModel: 85\nModel name: Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz\nStepping: 7\nCPU MHz: 3100.910\nBogoMIPS: 4999.99\nHypervisor vendor: KVM\nVirtualization type: full\nL1d cache: 128 KiB\nL1i cache: 128 KiB\nL2 cache: 4 MiB\nL3 cache: 35.8 MiB\nNUMA node0 CPU(s): 0-7\nVulnerability Itlb multihit: KVM: Vulnerable\nVulnerability L1tf: Mitigation; PTE Inversion\nVulnerability Mds: Vulnerable: Clear CPU buffers attempted, no microcode; SMT Host state unknown\nVulnerability Meltdown: Mitigation; PTI\nVulnerability Spec store bypass: Vulnerable\nVulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization\nVulnerability Spectre v2: Mitigation; Retpolines, STIBP disabled, RSB filling\nVulnerability Srbds: Not affected\nVulnerability Tsx async abort: Not affected\nFlags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves ida arat pku ospke avx512_vnni\n",
+ "versions": {
+ "transformers": "4.20.1",
+ "optimum": "1.2.3.dev0",
+ "optimum_hash": "5c9af4e5f93c7e9bd523563230732b49603dc4d7"
+ },
+ "evaluation": {
+ "time": [
+ {
+ "batch_size": 4,
+ "input_length": 64,
+ "baseline": {
+ "nb_forwards": 126,
+ "throughput": 8.4,
+ "latency_mean": 119.97364652380952,
+ "latency_std": 1.1115995809677575,
+ "latency_50": 119.8484215,
+ "latency_90": 121.1081755,
+ "latency_95": 121.99861425,
+ "latency_99": 122.6797695,
+ "latency_999": 124.28595525
+ },
+ "optimized": {
+ "nb_forwards": 253,
+ "throughput": 16.87,
+ "latency_mean": 59.49775931225296,
+ "latency_std": 3.518559570257517,
+ "latency_50": 58.504581,
+ "latency_90": 64.1940764,
+ "latency_95": 66.06759079999999,
+ "latency_99": 68.27611223999999,
+ "latency_999": 69.44462915599999
+ }
+ }
+ ],
+ "others": {
+ "baseline": {
+ "precision": 0.9358012339503085,
+ "recall": 0.9444631437226523,
+ "f1": 0.9401122372057961,
+ "accuracy": 0.9882013940267124
+ },
+ "optimized": {
+ "precision": 0.06543578604398588,
+ "recall": 0.24335240659710536,
+ "f1": 0.10313837375178317,
+ "accuracy": 0.35697597445582335
+ }
+ }
+ },
+ "max_eval_samples": null,
+ "time_benchmark_args": {
+ "duration": 15,
+ "warmup_runs": 5
+ },
+ "model_type": "distilbert"
+ }
+ ]
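The `throughput` values in `runs.json` appear to be the number of forward passes divided by the benchmark duration, that is, batches per second rather than samples per second. This is inferred from the numbers alone, not from the benchmarking code, so treat the sketch below as a consistency check under that assumption. The helper name `throughput` and the `measurements` list are introduced here for illustration.

```python
# time_benchmark_args.duration from runs.json, in seconds.
DURATION_S = 15

# (nb_forwards, reported throughput) pairs copied from runs.json:
# baseline and optimized for each of the two runs.
measurements = [
    (125, 8.33),
    (324, 21.6),
    (126, 8.4),
    (253, 16.87),
]


def throughput(nb_forwards, duration_s=DURATION_S):
    """Forward passes per second, rounded to two decimals (assumed definition)."""
    return round(nb_forwards / duration_s, 2)


for nb_forwards, reported in measurements:
    status = "matches" if throughput(nb_forwards) == reported else "differs from"
    print(f"{nb_forwards} forwards -> {throughput(nb_forwards)} /s ({status} reported {reported})")
```

If the assumption holds, multiplying a throughput figure by the batch size of 4 gives an approximate samples-per-second rate.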
tensorboard/1657707611.2962544/events.out.tfevents.1657707611.ip-10-2-64-206.ec2.internal.1.1 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7ce86125e8a636be39bff10bff27849d2a71abd354b6c809d87f862403119453
+ size 696
tensorboard/1657707611.2976747/events.out.tfevents.1657707611.ip-10-2-64-206.ec2.internal.1.2 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:38bb5da1b600d8842d3d85a2b2104775dd0f3b1beddbfd60fe8de0f5c5522440
+ size 644
tensorboard/events.out.tfevents.1657707611.ip-10-2-64-206.ec2.internal.1.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:5315e44ede4671066360385cecf9da4cae04e4f04d0d27849665f6a042e0969d
+ size 40
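The three TensorBoard event files are stored as Git LFS pointers: small text files of `key value` lines that stand in for the real binary. A minimal sketch of reading one follows, assuming only the three keys seen in this commit (`version`, `oid`, `size`). The function name `parse_lfs_pointer` and the returned field names are hypothetical.

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into a dict (handles version, oid, size only)."""
    # Each non-empty line is "<key> <value>", split on the first space.
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    # The oid has the form "<hash algorithm>:<hex digest>".
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file, in bytes
    }


pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:5315e44ede4671066360385cecf9da4cae04e4f04d0d27849665f6a042e0969d
size 40
"""

info = parse_lfs_pointer(pointer)
print(info["hash_algo"], info["size"])
```

The `size` field refers to the tracked file, not to the pointer itself, so the last event file in this commit is 40 bytes long.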