BoDong committed on
Commit
4d03cd5
1 Parent(s): d62a5ce

First model version

Neural_Engine_INT8_IR/conf.yaml ADDED
@@ -0,0 +1,2299 @@
+ model:
+   name: model
+   operator:
+     input_data:
+       type: Input
+       output:
+         input_ids:0:
+           dtype: int32
+           shape: [-1, -1]
+         segment_ids:0:
+           dtype: int32
+           shape: [-1, -1]
+         input_mask:0:
+           dtype: int32
+           shape: [-1, -1]
+         bert.embeddings.position_embeddings.weight:0:
+           dtype: fp32
+           shape: [512, 256]
+           location: [0, 524288]
+         bert.embeddings.word_embeddings.weight:0:
+           dtype: fp32
+           shape: [30522, 256]
+           location: [524288, 31254528]
+         bert.embeddings.token_type_embeddings.weight:0:
+           dtype: fp32
+           shape: [2, 256]
+           location: [31778816, 2048]
+         bert.embeddings.LayerNorm.weight:0:
+           dtype: fp32
+           shape: [256]
+           location: [31780864, 1024]
+         bert.embeddings.LayerNorm.bias:0:
+           dtype: fp32
+           shape: [256]
+           location: [31781888, 1024]
+         111:0_min:
+           dtype: fp32
+           shape: [1]
+           location: [31782912, 4]
+         111:0_max:
+           dtype: fp32
+           shape: [1]
+           location: [31782916, 4]
+         '576:0':
+           dtype: s8
+           shape: [256, 256]
+           location: [31782920, 65536]
+         bert.encoder.layer.0.attention.self.key.bias:0:
+           dtype: s32
+           shape: [256]
+           location: [31848456, 1024]
+         111:0_quant_min:
+           dtype: fp32
+           shape: [1]
+           location: [31988776, 4]
+         111:0_quant_max:
+           dtype: fp32
+           shape: [1]
+           location: [31988780, 4]
+         576:0_min:
+           dtype: fp32
+           shape: [256]
+           location: [31849480, 1024]
+         576:0_max:
+           dtype: fp32
+           shape: [256]
+           location: [31850504, 1024]
+         Add_34:0_min:
+           dtype: fp32
+           shape: [1]
+           location: [31851536, 4]
+         Add_34:0_max:
+           dtype: fp32
+           shape: [1]
+           location: [31851540, 4]
+         '579:0':
+           dtype: s8
+           shape: [256, 256]
+           location: [31851544, 65536]
+         bert.encoder.layer.0.attention.self.value.bias:0:
+           dtype: s32
+           shape: [256]
+           location: [31917080, 1024]
+         579:0_min:
+           dtype: fp32
+           shape: [256]
+           location: [31918104, 1024]
+         579:0_max:
+           dtype: fp32
+           shape: [256]
+           location: [31919128, 1024]
+         Add_46:0_min:
+           dtype: fp32
+           shape: [1]
+           location: [31920160, 4]
+         Add_46:0_max:
+           dtype: fp32
+           shape: [1]
+           location: [31920164, 4]
+         '575:0':
+           dtype: s8
+           shape: [256, 256]
+           location: [31920168, 65536]
+         bert.encoder.layer.0.attention.self.query.bias:0:
+           dtype: s32
+           shape: [256]
+           location: [31985704, 1024]
+         575:0_min:
+           dtype: fp32
+           shape: [256]
+           location: [31986728, 1024]
+         575:0_max:
+           dtype: fp32
+           shape: [256]
+           location: [31987752, 1024]
+         Add_32:0_min:
+           dtype: fp32
+           shape: [1]
+           location: [31988784, 4]
+         Add_32:0_max:
+           dtype: fp32
+           shape: [1]
+           location: [31988788, 4]
+         163:0_quant_min:
+           dtype: fp32
+           shape: [1]
+           location: [31988792, 4]
+         163:0_quant_max:
+           dtype: fp32
+           shape: [1]
+           location: [31988796, 4]
+         131:0_quant_min:
+           dtype: fp32
+           shape: [1]
+           location: [31988800, 4]
+         131:0_quant_max:
+           dtype: fp32
+           shape: [1]
+           location: [31988804, 4]
+         169:0_min:
+           dtype: fp32
+           shape: [1]
+           location: [31988808, 4]
+         169:0_max:
+           dtype: fp32
+           shape: [1]
+           location: [31988812, 4]
+         170:0_quant_min:
+           dtype: fp32
+           shape: [1]
+           location: [31988824, 4]
+         170:0_quant_max:
+           dtype: fp32
+           shape: [1]
+           location: [31988828, 4]
+         148:0_quant_min:
+           dtype: fp32
+           shape: [1]
+           location: [31988832, 4]
+         148:0_quant_max:
+           dtype: fp32
+           shape: [1]
+           location: [31988836, 4]
+         172:0_min:
+           dtype: fp32
+           shape: [1]
+           location: [31988840, 4]
+         172:0_max:
+           dtype: fp32
+           shape: [1]
+           location: [31988844, 4]
+         '585:0':
+           dtype: s8
+           shape: [256, 256]
+           location: [31988848, 65536]
+         bert.encoder.layer.0.attention.output.dense.bias:0:
+           dtype: s32
+           shape: [256]
+           location: [32054384, 1024]
+         184:0_quant_min:
+           dtype: fp32
+           shape: [1]
+           location: [32057456, 4]
+         184:0_quant_max:
+           dtype: fp32
+           shape: [1]
+           location: [32057460, 4]
+         585:0_min:
+           dtype: fp32
+           shape: [256]
+           location: [32055408, 1024]
+         585:0_max:
+           dtype: fp32
+           shape: [256]
+           location: [32056432, 1024]
+         188:0_min:
+           dtype: fp32
+           shape: [1]
+           location: [32057464, 4]
+         188:0_max:
+           dtype: fp32
+           shape: [1]
+           location: [32057468, 4]
+         bert.encoder.layer.0.attention.output.LayerNorm.weight:0:
+           dtype: fp32
+           shape: [256]
+           location: [32057472, 1024]
+         bert.encoder.layer.0.attention.output.LayerNorm.bias:0:
+           dtype: fp32
+           shape: [256]
+           location: [32058496, 1024]
+         199:0_min:
+           dtype: fp32
+           shape: [1]
+           location: [32059520, 4]
+         199:0_max:
+           dtype: fp32
+           shape: [1]
+           location: [32059524, 4]
+         '586:0':
+           dtype: s8
+           shape: [1024, 256]
+           location: [32059528, 262144]
+         bert.encoder.layer.0.intermediate.dense.bias:0:
+           dtype: s32
+           shape: [1024]
+           location: [32321672, 4096]
+         199:0_quant_min:
+           dtype: fp32
+           shape: [1]
+           location: [32333960, 4]
+         199:0_quant_max:
+           dtype: fp32
+           shape: [1]
+           location: [32333964, 4]
+         586:0_min:
+           dtype: fp32
+           shape: [1024]
+           location: [32325768, 4096]
+         586:0_max:
+           dtype: fp32
+           shape: [1024]
+           location: [32329864, 4096]
+         210:0_quant_min:
+           dtype: fp32
+           shape: [1]
+           location: [32599200, 4]
+         210:0_quant_max:
+           dtype: fp32
+           shape: [1]
+           location: [32599204, 4]
+         '587:0':
+           dtype: s8
+           shape: [256, 1024]
+           location: [32333984, 262144]
+         bert.encoder.layer.0.output.dense.bias:0:
+           dtype: s32
+           shape: [256]
+           location: [32596128, 1024]
+         587:0_min:
+           dtype: fp32
+           shape: [256]
+           location: [32597152, 1024]
+         587:0_max:
+           dtype: fp32
+           shape: [256]
+           location: [32598176, 1024]
+         214:0_min:
+           dtype: fp32
+           shape: [1]
+           location: [32599208, 4]
+         214:0_max:
+           dtype: fp32
+           shape: [1]
+           location: [32599212, 4]
+         bert.encoder.layer.0.output.LayerNorm.weight:0:
+           dtype: fp32
+           shape: [256]
+           location: [32599216, 1024]
+         bert.encoder.layer.0.output.LayerNorm.bias:0:
+           dtype: fp32
+           shape: [256]
+           location: [32600240, 1024]
+         225:0_min:
+           dtype: fp32
+           shape: [1]
+           location: [32601264, 4]
+         225:0_max:
+           dtype: fp32
+           shape: [1]
+           location: [32601268, 4]
+         '589:0':
+           dtype: s8
+           shape: [256, 256]
+           location: [32601272, 65536]
+         bert.encoder.layer.1.attention.self.key.bias:0:
+           dtype: s32
+           shape: [256]
+           location: [32666808, 1024]
+         225:0_quant_min:
+           dtype: fp32
+           shape: [1]
+           location: [32807128, 4]
+         225:0_quant_max:
+           dtype: fp32
+           shape: [1]
+           location: [32807132, 4]
+         589:0_min:
+           dtype: fp32
+           shape: [256]
+           location: [32667832, 1024]
+         589:0_max:
+           dtype: fp32
+           shape: [256]
+           location: [32668856, 1024]
+         Add_128:0_min:
+           dtype: fp32
+           shape: [1]
+           location: [32669888, 4]
+         Add_128:0_max:
+           dtype: fp32
+           shape: [1]
+           location: [32669892, 4]
+         '592:0':
+           dtype: s8
+           shape: [256, 256]
+           location: [32669896, 65536]
+         bert.encoder.layer.1.attention.self.value.bias:0:
+           dtype: s32
+           shape: [256]
+           location: [32735432, 1024]
+         592:0_min:
+           dtype: fp32
+           shape: [256]
+           location: [32736456, 1024]
+         592:0_max:
+           dtype: fp32
+           shape: [256]
+           location: [32737480, 1024]
+         Add_140:0_min:
+           dtype: fp32
+           shape: [1]
+           location: [32738512, 4]
+         Add_140:0_max:
+           dtype: fp32
+           shape: [1]
+           location: [32738516, 4]
+         '588:0':
+           dtype: s8
+           shape: [256, 256]
+           location: [32738520, 65536]
+         bert.encoder.layer.1.attention.self.query.bias:0:
+           dtype: s32
+           shape: [256]
+           location: [32804056, 1024]
+         588:0_min:
+           dtype: fp32
+           shape: [256]
+           location: [32805080, 1024]
+         588:0_max:
+           dtype: fp32
+           shape: [256]
+           location: [32806104, 1024]
+         Add_126:0_min:
+           dtype: fp32
+           shape: [1]
+           location: [32807136, 4]
+         Add_126:0_max:
+           dtype: fp32
+           shape: [1]
+           location: [32807140, 4]
+         277:0_quant_min:
+           dtype: fp32
+           shape: [1]
+           location: [32807144, 4]
+         277:0_quant_max:
+           dtype: fp32
+           shape: [1]
+           location: [32807148, 4]
+         245:0_quant_min:
+           dtype: fp32
+           shape: [1]
+           location: [32807152, 4]
+         245:0_quant_max:
+           dtype: fp32
+           shape: [1]
+           location: [32807156, 4]
+         283:0_min:
+           dtype: fp32
+           shape: [1]
+           location: [32807160, 4]
+         283:0_max:
+           dtype: fp32
+           shape: [1]
+           location: [32807164, 4]
+         284:0_quant_min:
+           dtype: fp32
+           shape: [1]
+           location: [32807176, 4]
+         284:0_quant_max:
+           dtype: fp32
+           shape: [1]
+           location: [32807180, 4]
+         262:0_quant_min:
+           dtype: fp32
+           shape: [1]
+           location: [32807184, 4]
+         262:0_quant_max:
+           dtype: fp32
+           shape: [1]
+           location: [32807188, 4]
+         286:0_min:
+           dtype: fp32
+           shape: [1]
+           location: [32807192, 4]
+         286:0_max:
+           dtype: fp32
+           shape: [1]
+           location: [32807196, 4]
+         '598:0':
+           dtype: s8
+           shape: [256, 256]
+           location: [32807200, 65536]
+         bert.encoder.layer.1.attention.output.dense.bias:0:
+           dtype: s32
+           shape: [256]
+           location: [32872736, 1024]
+         298:0_quant_min:
+           dtype: fp32
+           shape: [1]
+           location: [32875808, 4]
+         298:0_quant_max:
+           dtype: fp32
+           shape: [1]
+           location: [32875812, 4]
+         598:0_min:
+           dtype: fp32
+           shape: [256]
+           location: [32873760, 1024]
+         598:0_max:
+           dtype: fp32
+           shape: [256]
+           location: [32874784, 1024]
+         302:0_min:
+           dtype: fp32
+           shape: [1]
+           location: [32875816, 4]
+         302:0_max:
+           dtype: fp32
+           shape: [1]
+           location: [32875820, 4]
+         bert.encoder.layer.1.attention.output.LayerNorm.weight:0:
+           dtype: fp32
+           shape: [256]
+           location: [32875824, 1024]
+         bert.encoder.layer.1.attention.output.LayerNorm.bias:0:
+           dtype: fp32
+           shape: [256]
+           location: [32876848, 1024]
+         313:0_min:
+           dtype: fp32
+           shape: [1]
+           location: [32877872, 4]
+         313:0_max:
+           dtype: fp32
+           shape: [1]
+           location: [32877876, 4]
+         '599:0':
+           dtype: s8
+           shape: [1024, 256]
+           location: [32877880, 262144]
+         bert.encoder.layer.1.intermediate.dense.bias:0:
+           dtype: s32
+           shape: [1024]
+           location: [33140024, 4096]
+         313:0_quant_min:
+           dtype: fp32
+           shape: [1]
+           location: [33152312, 4]
+         313:0_quant_max:
+           dtype: fp32
+           shape: [1]
+           location: [33152316, 4]
+         599:0_min:
+           dtype: fp32
+           shape: [1024]
+           location: [33144120, 4096]
+         599:0_max:
+           dtype: fp32
+           shape: [1024]
+           location: [33148216, 4096]
+         324:0_quant_min:
+           dtype: fp32
+           shape: [1]
+           location: [33417552, 4]
+         324:0_quant_max:
+           dtype: fp32
+           shape: [1]
+           location: [33417556, 4]
+         '600:0':
+           dtype: s8
+           shape: [256, 1024]
+           location: [33152336, 262144]
+         bert.encoder.layer.1.output.dense.bias:0:
+           dtype: s32
+           shape: [256]
+           location: [33414480, 1024]
+         600:0_min:
+           dtype: fp32
+           shape: [256]
+           location: [33415504, 1024]
+         600:0_max:
+           dtype: fp32
+           shape: [256]
+           location: [33416528, 1024]
+         328:0_min:
+           dtype: fp32
+           shape: [1]
+           location: [33417560, 4]
+         328:0_max:
+           dtype: fp32
+           shape: [1]
+           location: [33417564, 4]
+         bert.encoder.layer.1.output.LayerNorm.weight:0:
+           dtype: fp32
+           shape: [256]
+           location: [33417568, 1024]
+         bert.encoder.layer.1.output.LayerNorm.bias:0:
+           dtype: fp32
+           shape: [256]
+           location: [33418592, 1024]
+         339:0_min:
+           dtype: fp32
+           shape: [1]
+           location: [33419616, 4]
+         339:0_max:
+           dtype: fp32
+           shape: [1]
+           location: [33419620, 4]
+         '602:0':
+           dtype: s8
+           shape: [256, 256]
+           location: [33419624, 65536]
+         bert.encoder.layer.2.attention.self.key.bias:0:
+           dtype: s32
+           shape: [256]
+           location: [33485160, 1024]
+         339:0_quant_min:
+           dtype: fp32
+           shape: [1]
+           location: [33625480, 4]
+         339:0_quant_max:
+           dtype: fp32
+           shape: [1]
+           location: [33625484, 4]
+         602:0_min:
+           dtype: fp32
+           shape: [256]
+           location: [33486184, 1024]
+         602:0_max:
+           dtype: fp32
+           shape: [256]
+           location: [33487208, 1024]
+         Add_222:0_min:
+           dtype: fp32
+           shape: [1]
+           location: [33488240, 4]
+         Add_222:0_max:
+           dtype: fp32
+           shape: [1]
+           location: [33488244, 4]
+         '605:0':
+           dtype: s8
+           shape: [256, 256]
+           location: [33488248, 65536]
+         bert.encoder.layer.2.attention.self.value.bias:0:
+           dtype: s32
+           shape: [256]
+           location: [33553784, 1024]
+         605:0_min:
+           dtype: fp32
+           shape: [256]
+           location: [33554808, 1024]
+         605:0_max:
+           dtype: fp32
+           shape: [256]
+           location: [33555832, 1024]
+         Add_234:0_min:
+           dtype: fp32
+           shape: [1]
+           location: [33556864, 4]
+         Add_234:0_max:
+           dtype: fp32
+           shape: [1]
+           location: [33556868, 4]
+         '601:0':
+           dtype: s8
+           shape: [256, 256]
+           location: [33556872, 65536]
+         bert.encoder.layer.2.attention.self.query.bias:0:
+           dtype: s32
+           shape: [256]
+           location: [33622408, 1024]
+         601:0_min:
+           dtype: fp32
+           shape: [256]
+           location: [33623432, 1024]
+         601:0_max:
+           dtype: fp32
+           shape: [256]
+           location: [33624456, 1024]
+         Add_220:0_min:
+           dtype: fp32
+           shape: [1]
+           location: [33625488, 4]
+         Add_220:0_max:
+           dtype: fp32
+           shape: [1]
+           location: [33625492, 4]
+         391:0_quant_min:
+           dtype: fp32
+           shape: [1]
+           location: [33625496, 4]
+         391:0_quant_max:
+           dtype: fp32
+           shape: [1]
+           location: [33625500, 4]
+         359:0_quant_min:
+           dtype: fp32
+           shape: [1]
+           location: [33625504, 4]
+         359:0_quant_max:
+           dtype: fp32
+           shape: [1]
+           location: [33625508, 4]
+         397:0_min:
+           dtype: fp32
+           shape: [1]
+           location: [33625512, 4]
+         397:0_max:
+           dtype: fp32
+           shape: [1]
+           location: [33625516, 4]
+         398:0_quant_min:
+           dtype: fp32
+           shape: [1]
+           location: [33625528, 4]
+         398:0_quant_max:
+           dtype: fp32
+           shape: [1]
+           location: [33625532, 4]
+         376:0_quant_min:
+           dtype: fp32
+           shape: [1]
+           location: [33625536, 4]
+         376:0_quant_max:
+           dtype: fp32
+           shape: [1]
+           location: [33625540, 4]
+         400:0_min:
+           dtype: fp32
+           shape: [1]
+           location: [33625544, 4]
+         400:0_max:
+           dtype: fp32
+           shape: [1]
+           location: [33625548, 4]
+         '611:0':
+           dtype: s8
+           shape: [256, 256]
+           location: [33625552, 65536]
+         bert.encoder.layer.2.attention.output.dense.bias:0:
+           dtype: s32
+           shape: [256]
+           location: [33691088, 1024]
+         412:0_quant_min:
+           dtype: fp32
+           shape: [1]
+           location: [33694160, 4]
+         412:0_quant_max:
+           dtype: fp32
+           shape: [1]
+           location: [33694164, 4]
+         611:0_min:
+           dtype: fp32
+           shape: [256]
+           location: [33692112, 1024]
+         611:0_max:
+           dtype: fp32
+           shape: [256]
+           location: [33693136, 1024]
+         416:0_min:
+           dtype: fp32
+           shape: [1]
+           location: [33694168, 4]
+         416:0_max:
+           dtype: fp32
+           shape: [1]
+           location: [33694172, 4]
+         bert.encoder.layer.2.attention.output.LayerNorm.weight:0:
+           dtype: fp32
+           shape: [256]
+           location: [33694176, 1024]
+         bert.encoder.layer.2.attention.output.LayerNorm.bias:0:
+           dtype: fp32
+           shape: [256]
+           location: [33695200, 1024]
+         427:0_min:
+           dtype: fp32
+           shape: [1]
+           location: [33696224, 4]
+         427:0_max:
+           dtype: fp32
+           shape: [1]
+           location: [33696228, 4]
+         '612:0':
+           dtype: s8
+           shape: [1024, 256]
+           location: [33696232, 262144]
+         bert.encoder.layer.2.intermediate.dense.bias:0:
+           dtype: s32
+           shape: [1024]
+           location: [33958376, 4096]
+         427:0_quant_min:
+           dtype: fp32
+           shape: [1]
+           location: [33970664, 4]
+         427:0_quant_max:
+           dtype: fp32
+           shape: [1]
+           location: [33970668, 4]
+         612:0_min:
+           dtype: fp32
+           shape: [1024]
+           location: [33962472, 4096]
+         612:0_max:
+           dtype: fp32
+           shape: [1024]
+           location: [33966568, 4096]
+         438:0_quant_min:
+           dtype: fp32
+           shape: [1]
+           location: [34235904, 4]
+         438:0_quant_max:
+           dtype: fp32
+           shape: [1]
+           location: [34235908, 4]
+         '613:0':
+           dtype: s8
+           shape: [256, 1024]
+           location: [33970688, 262144]
+         bert.encoder.layer.2.output.dense.bias:0:
+           dtype: s32
+           shape: [256]
+           location: [34232832, 1024]
+         613:0_min:
+           dtype: fp32
+           shape: [256]
+           location: [34233856, 1024]
+         613:0_max:
+           dtype: fp32
+           shape: [256]
+           location: [34234880, 1024]
+         442:0_min:
+           dtype: fp32
+           shape: [1]
+           location: [34235912, 4]
+         442:0_max:
+           dtype: fp32
+           shape: [1]
+           location: [34235916, 4]
+         bert.encoder.layer.2.output.LayerNorm.weight:0:
+           dtype: fp32
+           shape: [256]
+           location: [34235920, 1024]
+         bert.encoder.layer.2.output.LayerNorm.bias:0:
+           dtype: fp32
+           shape: [256]
+           location: [34236944, 1024]
+         453:0_min:
+           dtype: fp32
+           shape: [1]
+           location: [34237968, 4]
+         453:0_max:
+           dtype: fp32
+           shape: [1]
+           location: [34237972, 4]
+         '615:0':
+           dtype: s8
+           shape: [256, 256]
+           location: [34237976, 65536]
+         bert.encoder.layer.3.attention.self.key.bias:0:
+           dtype: s32
+           shape: [256]
+           location: [34303512, 1024]
+         453:0_quant_min:
+           dtype: fp32
+           shape: [1]
+           location: [34443832, 4]
+         453:0_quant_max:
+           dtype: fp32
+           shape: [1]
+           location: [34443836, 4]
+         615:0_min:
+           dtype: fp32
+           shape: [256]
+           location: [34304536, 1024]
+         615:0_max:
+           dtype: fp32
+           shape: [256]
+           location: [34305560, 1024]
+         Add_316:0_min:
+           dtype: fp32
+           shape: [1]
+           location: [34306592, 4]
+         Add_316:0_max:
+           dtype: fp32
+           shape: [1]
+           location: [34306596, 4]
+         '618:0':
+           dtype: s8
+           shape: [256, 256]
+           location: [34306600, 65536]
+         bert.encoder.layer.3.attention.self.value.bias:0:
+           dtype: s32
+           shape: [256]
+           location: [34372136, 1024]
+         618:0_min:
+           dtype: fp32
+           shape: [256]
+           location: [34373160, 1024]
+         618:0_max:
+           dtype: fp32
+           shape: [256]
+           location: [34374184, 1024]
+         Add_328:0_min:
+           dtype: fp32
+           shape: [1]
+           location: [34375216, 4]
+         Add_328:0_max:
+           dtype: fp32
+           shape: [1]
+           location: [34375220, 4]
+         '614:0':
+           dtype: s8
+           shape: [256, 256]
+           location: [34375224, 65536]
+         bert.encoder.layer.3.attention.self.query.bias:0:
+           dtype: s32
+           shape: [256]
+           location: [34440760, 1024]
+         614:0_min:
+           dtype: fp32
+           shape: [256]
+           location: [34441784, 1024]
+         614:0_max:
+           dtype: fp32
+           shape: [256]
+           location: [34442808, 1024]
+         Add_314:0_min:
+           dtype: fp32
+           shape: [1]
+           location: [34443840, 4]
+         Add_314:0_max:
+           dtype: fp32
+           shape: [1]
+           location: [34443844, 4]
+         505:0_quant_min:
+           dtype: fp32
+           shape: [1]
+           location: [34443848, 4]
+         505:0_quant_max:
+           dtype: fp32
+           shape: [1]
+           location: [34443852, 4]
+         473:0_quant_min:
+           dtype: fp32
+           shape: [1]
+           location: [34443856, 4]
+         473:0_quant_max:
+           dtype: fp32
+           shape: [1]
+           location: [34443860, 4]
+         511:0_min:
+           dtype: fp32
+           shape: [1]
+           location: [34443864, 4]
+         511:0_max:
+           dtype: fp32
+           shape: [1]
+           location: [34443868, 4]
+         512:0_quant_min:
+           dtype: fp32
+           shape: [1]
+           location: [34443880, 4]
+         512:0_quant_max:
+           dtype: fp32
+           shape: [1]
+           location: [34443884, 4]
+         490:0_quant_min:
+           dtype: fp32
+           shape: [1]
+           location: [34443888, 4]
+         490:0_quant_max:
+           dtype: fp32
+           shape: [1]
+           location: [34443892, 4]
+         514:0_min:
+           dtype: fp32
+           shape: [1]
+           location: [34443896, 4]
+         514:0_max:
+           dtype: fp32
+           shape: [1]
+           location: [34443900, 4]
+         '624:0':
+           dtype: s8
+           shape: [256, 256]
+           location: [34443904, 65536]
+         bert.encoder.layer.3.attention.output.dense.bias:0:
+           dtype: s32
+           shape: [256]
+           location: [34509440, 1024]
+         526:0_quant_min:
+           dtype: fp32
+           shape: [1]
+           location: [34512512, 4]
+         526:0_quant_max:
+           dtype: fp32
+           shape: [1]
+           location: [34512516, 4]
+         624:0_min:
+           dtype: fp32
+           shape: [256]
+           location: [34510464, 1024]
+         624:0_max:
+           dtype: fp32
+           shape: [256]
+           location: [34511488, 1024]
+         530:0_min:
+           dtype: fp32
+           shape: [1]
+           location: [34512520, 4]
+         530:0_max:
+           dtype: fp32
+           shape: [1]
+           location: [34512524, 4]
+         bert.encoder.layer.3.attention.output.LayerNorm.weight:0:
+           dtype: fp32
+           shape: [256]
+           location: [34512528, 1024]
+         bert.encoder.layer.3.attention.output.LayerNorm.bias:0:
+           dtype: fp32
+           shape: [256]
+           location: [34513552, 1024]
+         541:0_min:
+           dtype: fp32
+           shape: [1]
+           location: [34514576, 4]
+         541:0_max:
+           dtype: fp32
+           shape: [1]
+           location: [34514580, 4]
+         '625:0':
+           dtype: s8
+           shape: [1024, 256]
+           location: [34514584, 262144]
+         bert.encoder.layer.3.intermediate.dense.bias:0:
+           dtype: s32
+           shape: [1024]
+           location: [34776728, 4096]
+         541:0_quant_min:
+           dtype: fp32
+           shape: [1]
+           location: [34789016, 4]
+         541:0_quant_max:
+           dtype: fp32
+           shape: [1]
+           location: [34789020, 4]
+         625:0_min:
+           dtype: fp32
+           shape: [1024]
+           location: [34780824, 4096]
+         625:0_max:
+           dtype: fp32
+           shape: [1024]
+           location: [34784920, 4096]
+         552:0_quant_min:
+           dtype: fp32
+           shape: [1]
+           location: [35054256, 4]
+         552:0_quant_max:
+           dtype: fp32
+           shape: [1]
+           location: [35054260, 4]
+         '626:0':
+           dtype: s8
+           shape: [256, 1024]
+           location: [34789040, 262144]
+         bert.encoder.layer.3.output.dense.bias:0:
+           dtype: s32
+           shape: [256]
+           location: [35051184, 1024]
+         626:0_min:
+           dtype: fp32
+           shape: [256]
+           location: [35052208, 1024]
+         626:0_max:
+           dtype: fp32
+           shape: [256]
+           location: [35053232, 1024]
+         556:0_min:
+           dtype: fp32
+           shape: [1]
+           location: [35054264, 4]
+         556:0_max:
+           dtype: fp32
+           shape: [1]
+           location: [35054268, 4]
+         bert.encoder.layer.3.output.LayerNorm.weight:0:
+           dtype: fp32
+           shape: [256]
+           location: [35054272, 1024]
+         bert.encoder.layer.3.output.LayerNorm.bias:0:
+           dtype: fp32
+           shape: [256]
+           location: [35055296, 1024]
+         569:0_min:
+           dtype: fp32
+           shape: [1]
+           location: [35056320, 4]
+         569:0_max:
+           dtype: fp32
+           shape: [1]
+           location: [35056324, 4]
+         bert.pooler.dense.weight:0:
+           dtype: s8
+           shape: [256, 256]
+           location: [35056328, 65536]
+         bert.pooler.dense.bias:0:
+           dtype: s32
+           shape: [256]
+           location: [35121864, 1024]
+         569:0_quant_min:
+           dtype: fp32
+           shape: [1]
+           location: [35122888, 4]
+         569:0_quant_max:
+           dtype: fp32
+           shape: [1]
+           location: [35122892, 4]
+         bert.pooler.dense.weight:0_min:
+           dtype: fp32
+           shape: [256]
+           location: [35122896, 1024]
+         bert.pooler.dense.weight:0_max:
+           dtype: fp32
+           shape: [256]
+           location: [35123920, 1024]
+         571:0_quant_min:
+           dtype: fp32
+           shape: [1]
+           location: [35125472, 4]
+         571:0_quant_max:
+           dtype: fp32
+           shape: [1]
+           location: [35125476, 4]
+         classifier.weight:0:
+           dtype: s8
+           shape: [2, 256]
+           location: [35124952, 512]
+         classifier.bias:0:
+           dtype: s32
+           shape: [2]
+           location: [35125464, 8]
+         classifier.weight:0_min:
+           dtype: fp32
+           shape: [2]
+           location: [35125480, 8]
+         classifier.weight:0_max:
+           dtype: fp32
+           shape: [2]
+           location: [35125488, 8]
+         output:0_min:
+           dtype: fp32
+           shape: [1]
+           location: [35125496, 4]
+         output:0_max:
+           dtype: fp32
+           shape: [1]
+           location: [35125500, 4]
+     padding_sequence:
+       type: PaddingSequence
+       input:
+         input_mask:0: {}
+       output:
+         padding_sequence:0: {}
+       attr:
+         dst_shape: -1,4,0,-1
+         dims: 1
+     position_embeddings/after/reshape:
+       type: Reshape
+       input:
+         bert.embeddings.position_embeddings.weight:0: {}
+         input_ids:0: {}
+       output:
+         position_embeddings/after/reshape:0: {}
+       attr:
+         dst_shape: 1,-1,256
+         dims: 1
+     Gather_18:
+       type: Reshape
+       input:
+         position_embeddings/after/reshape:0: {}
+       output:
+         '99:0': {}
+       attr:
+         dst_shape: 1,-1
+     word_embeddings/reshape:
+       type: Reshape
+       input:
+         input_ids:0: {}
+       output:
+         word_embeddings/reshape:0: {}
+       attr:
+         dst_shape: -1
+     Gather_15:
+       type: Gather
+       input:
+         word_embeddings/reshape:0: {}
+         bert.embeddings.word_embeddings.weight:0: {}
+       output:
+         Gather_15:0: {}
+       attr:
+         axis: 0
+         batch_dims: 0
+     word_embeddings/after/reshape:
+       type: Reshape
+       input:
+         Gather_15:0: {}
+         input_ids:0: {}
+       output:
+         word_embeddings/after/reshape:0: {}
+       attr:
+         dst_shape: -1,-1,256
+         dims: 0,1
+     word_embeddings/add_reshape:
+       type: Reshape
+       input:
+         word_embeddings/after/reshape:0: {}
+         input_ids:0: {}
+       output:
+         word_embeddings/add_reshape:0: {}
+       attr:
+         dst_shape: -1,-1,256
+         dims: 0,1
+         mul: 1,2
+     token_type_embeddings/reshape:
+       type: Reshape
+       input:
+         segment_ids:0: {}
+       output:
+         token_type_embeddings/reshape:0: {}
+       attr:
+         dst_shape: -1
+     Gather_16:
+       type: Gather
+       input:
+         token_type_embeddings/reshape:0: {}
+         bert.embeddings.token_type_embeddings.weight:0: {}
+       output:
+         Gather_16:0: {}
+       attr:
+         axis: 0
+         batch_dims: 0
+     token_type_embeddings/after/reshape:
+       type: Reshape
+       input:
+         Gather_16:0: {}
+         segment_ids:0: {}
+       output:
+         token_type_embeddings/after/reshape:0: {}
+       attr:
+         dst_shape: -1,-1,256
+         dims: 0,1
+     token_type_embeddings/add_reshape:
+       type: Reshape
+       input:
+         token_type_embeddings/after/reshape:0: {}
+         segment_ids:0: {}
+       output:
+         token_type_embeddings/add_reshape:0: {}
+       attr:
+         dst_shape: -1,-1,256
+         dims: 0,1
+         mul: 1,2
+     Add_17:
+       type: BinaryAdd
+       input:
+         token_type_embeddings/add_reshape:0: {}
+         '99:0': {}
+         word_embeddings/add_reshape:0: {}
+       output:
+         Add_17:0: {}
+       attr:
+         append_op: sum
+     embeddings/after_add_reshape:
+       type: Reshape
+       input:
+         Add_17:0: {}
+         input_ids:0: {}
+       output:
+         embeddings/after_add_reshape:0: {}
+       attr:
+         dst_shape: -1,-1,256
+         dims: 0,1
+     embeddings_add/reshape_2d:
+       type: Reshape
+       input:
+         embeddings/after_add_reshape:0: {}
+       output:
+         embeddings_add/reshape_2d:0: {}
+       attr:
+         dst_shape: -1,256
+     Add_30:
+       type: LayerNorm
+       input:
+         embeddings_add/reshape_2d:0: {}
+         bert.embeddings.LayerNorm.weight:0: {}
+         bert.embeddings.LayerNorm.bias:0: {}
+       output:
+         '111:0': {}
+       attr:
+         epsilon: 9.999999960041972e-13
+     Add_30_reorder_post:
+       type: Reorder
+       input:
+         '111:0': {}
+       output:
+         111:0_reorder: {}
+       attr:
+         src_perm: 0,1
+         dst_perm: 1,0
+     Add_34_quant_0:
+       type: Quantize
+       input:
+         111:0_reorder: {}
+         111:0_min: {}
+         111:0_max: {}
+       output:
+         111:0_quant: {}
+       attr:
+         output_dtype: u8
+     Add_34:
+       type: InnerProduct
+       input:
+         '576:0': {}
+         111:0_quant: {}
+         bert.encoder.layer.0.attention.self.key.bias:0: {}
1260
+ 576:0_min: {}
1261
+ 576:0_max: {}
1262
+ 111:0_quant_min: {}
1263
+ 111:0_quant_max: {}
1264
+ Add_34:0_min: {}
1265
+ Add_34:0_max: {}
1266
+ output:
1267
+ Add_34:0: {}
1268
+ attr:
1269
+ output_dtype: s8
1270
+ Reshape_44:
1271
+ type: Reshape
1272
+ input:
1273
+ Add_34:0: {}
1274
+ input_ids:0: {}
1275
+ output:
1276
+ 131:0_quant: {}
1277
+ attr:
1278
+ dst_shape: 4,64,-1,-1
1279
+ dims: '0'
1280
+ Add_46:
1281
+ type: InnerProduct
1282
+ input:
1283
+ '579:0': {}
1284
+ 111:0_quant: {}
1285
+ bert.encoder.layer.0.attention.self.value.bias:0: {}
1286
+ 579:0_min: {}
1287
+ 579:0_max: {}
1288
+ 111:0_quant_min: {}
1289
+ 111:0_quant_max: {}
1290
+ Add_46:0_min: {}
1291
+ Add_46:0_max: {}
1292
+ output:
1293
+ Add_46:0: {}
1294
+ attr:
1295
+ output_dtype: s8
1296
+ Reshape_56:
1297
+ type: Reshape
1298
+ input:
1299
+ Add_46:0: {}
1300
+ input_ids:0: {}
1301
+ output:
1302
+ 148:0_quant: {}
1303
+ attr:
1304
+ dst_shape: 4,64,-1,-1
1305
+ dims: '0'
1306
+ Add_32:
1307
+ type: InnerProduct
1308
+ input:
1309
+ '575:0': {}
1310
+ 111:0_quant: {}
1311
+ bert.encoder.layer.0.attention.self.query.bias:0: {}
1312
+ 575:0_min: {}
1313
+ 575:0_max: {}
1314
+ 111:0_quant_min: {}
1315
+ 111:0_quant_max: {}
1316
+ Add_32:0_min: {}
1317
+ Add_32:0_max: {}
1318
+ output:
1319
+ Add_32:0: {}
1320
+ attr:
1321
+ output_dtype: s8
1322
+ Reshape_67:
1323
+ type: Reshape
1324
+ input:
1325
+ Add_32:0: {}
1326
+ input_ids:0: {}
1327
+ output:
1328
+ 163:0_quant: {}
1329
+ attr:
1330
+ dst_shape: 4,64,-1,-1
1331
+ dims: '0'
1332
+ Add_73:
1333
+ type: Matmul
1334
+ input:
1335
+ 163:0_quant: {}
1336
+ 131:0_quant: {}
1337
+ padding_sequence:0: {}
1338
+ 163:0_quant_min: {}
1339
+ 163:0_quant_max: {}
1340
+ 131:0_quant_min: {}
1341
+ 131:0_quant_max: {}
1342
+ 169:0_min: {}
1343
+ 169:0_max: {}
1344
+ output:
1345
+ '169:0': {}
1346
+ attr:
1347
+ src0_perm: 2,0,3,1
1348
+ src1_perm: 2,0,1,3
1349
+ output_scale: 0.125
1350
+ format_any: false
1351
+ append_op: binary_add
1352
+ Softmax_74:
1353
+ type: Softmax
1354
+ input:
1355
+ '169:0': {}
1356
+ 170:0_quant_min: {}
1357
+ 170:0_quant_max: {}
1358
+ output:
1359
+ 170:0_quant: {}
1360
+ attr:
1361
+ output_dtype: u8
1362
+ Transpose_76:
1363
+ type: Matmul
1364
+ input:
1365
+ 170:0_quant: {}
1366
+ 148:0_quant: {}
1367
+ 170:0_quant_min: {}
1368
+ 170:0_quant_max: {}
1369
+ 148:0_quant_min: {}
1370
+ 148:0_quant_max: {}
1371
+ 172:0_min: {}
1372
+ 172:0_max: {}
1373
+ output:
1374
+ '172:0': {}
1375
+ attr:
1376
+ src1_perm: 2,0,3,1
1377
+ dst_perm: 1,3,0,2
1378
+ output_dtype: u8
1379
+ Reshape_86:
1380
+ type: Reshape
1381
+ input:
1382
+ '172:0': {}
1383
+ output:
1384
+ 184:0_quant: {}
1385
+ attr:
1386
+ dst_shape: 256,-1
1387
+ Add_89:
1388
+ type: InnerProduct
1389
+ input:
1390
+ '585:0': {}
1391
+ 184:0_quant: {}
1392
+ bert.encoder.layer.0.attention.output.dense.bias:0: {}
1393
+ 111:0_reorder: {}
1394
+ 585:0_min: {}
1395
+ 585:0_max: {}
1396
+ 184:0_quant_min: {}
1397
+ 184:0_quant_max: {}
1398
+ 188:0_min: {}
1399
+ 188:0_max: {}
1400
+ output:
1401
+ '188:0': {}
1402
+ attr:
1403
+ append_op: sum
1404
+ Add_100:
1405
+ type: LayerNorm
1406
+ input:
1407
+ '188:0': {}
1408
+ bert.encoder.layer.0.attention.output.LayerNorm.weight:0: {}
1409
+ bert.encoder.layer.0.attention.output.LayerNorm.bias:0: {}
1410
+ output:
1411
+ '199:0': {}
1412
+ attr:
1413
+ epsilon: 9.999999960041972e-13
1414
+ transpose_mode: 1,0
1415
+ Mul_110_quant_0:
1416
+ type: Quantize
1417
+ input:
1418
+ '199:0': {}
1419
+ 199:0_min: {}
1420
+ 199:0_max: {}
1421
+ output:
1422
+ 199:0_quant: {}
1423
+ attr:
1424
+ output_dtype: u8
1425
+ Mul_110:
1426
+ type: InnerProduct
1427
+ input:
1428
+ '586:0': {}
1429
+ 199:0_quant: {}
1430
+ bert.encoder.layer.0.intermediate.dense.bias:0: {}
1431
+ 586:0_min: {}
1432
+ 586:0_max: {}
1433
+ 199:0_quant_min: {}
1434
+ 199:0_quant_max: {}
1435
+ 210:0_quant_min: {}
1436
+ 210:0_quant_max: {}
1437
+ output:
1438
+ 210:0_quant: {}
1439
+ Mul_110_gelu:
1440
+ type: Gelu
1441
+ input:
1442
+ 210:0_quant: {}
1443
+ output:
1444
+ 210:0_quant_gelu: {}
1445
+ attr:
1446
+ algorithm: gelu_tanh
1447
+ Mul_110_gelu_quant:
1448
+ type: Quantize
1449
+ input:
1450
+ 210:0_quant_gelu: {}
1451
+ 210:0_quant_min: {}
1452
+ 210:0_quant_max: {}
1453
+ output:
1454
+ 210:0_quant_quant: {}
1455
+ attr:
1456
+ output_dtype: u8
1457
+ Add_113:
1458
+ type: InnerProduct
1459
+ input:
1460
+ '587:0': {}
1461
+ 210:0_quant_quant: {}
1462
+ bert.encoder.layer.0.output.dense.bias:0: {}
1463
+ '199:0': {}
1464
+ 587:0_min: {}
1465
+ 587:0_max: {}
1466
+ 210:0_quant_min: {}
1467
+ 210:0_quant_max: {}
1468
+ 214:0_min: {}
1469
+ 214:0_max: {}
1470
+ output:
1471
+ '214:0': {}
1472
+ attr:
1473
+ append_op: sum
1474
+ Add_124:
1475
+ type: LayerNorm
1476
+ input:
1477
+ '214:0': {}
1478
+ bert.encoder.layer.0.output.LayerNorm.weight:0: {}
1479
+ bert.encoder.layer.0.output.LayerNorm.bias:0: {}
1480
+ output:
1481
+ '225:0': {}
1482
+ attr:
1483
+ epsilon: 9.999999960041972e-13
1484
+ transpose_mode: 1,0
1485
+ Add_128_quant_0:
1486
+ type: Quantize
1487
+ input:
1488
+ '225:0': {}
1489
+ 225:0_min: {}
1490
+ 225:0_max: {}
1491
+ output:
1492
+ 225:0_quant: {}
1493
+ attr:
1494
+ output_dtype: u8
1495
+ Add_128:
1496
+ type: InnerProduct
1497
+ input:
1498
+ '589:0': {}
1499
+ 225:0_quant: {}
1500
+ bert.encoder.layer.1.attention.self.key.bias:0: {}
1501
+ 589:0_min: {}
1502
+ 589:0_max: {}
1503
+ 225:0_quant_min: {}
1504
+ 225:0_quant_max: {}
1505
+ Add_128:0_min: {}
1506
+ Add_128:0_max: {}
1507
+ output:
1508
+ Add_128:0: {}
1509
+ attr:
1510
+ output_dtype: s8
1511
+ Reshape_138:
1512
+ type: Reshape
1513
+ input:
1514
+ Add_128:0: {}
1515
+ input_ids:0: {}
1516
+ output:
1517
+ 245:0_quant: {}
1518
+ attr:
1519
+ dst_shape: 4,64,-1,-1
1520
+ dims: '0'
1521
+ Add_140:
1522
+ type: InnerProduct
1523
+ input:
1524
+ '592:0': {}
1525
+ 225:0_quant: {}
1526
+ bert.encoder.layer.1.attention.self.value.bias:0: {}
1527
+ 592:0_min: {}
1528
+ 592:0_max: {}
1529
+ 225:0_quant_min: {}
1530
+ 225:0_quant_max: {}
1531
+ Add_140:0_min: {}
1532
+ Add_140:0_max: {}
1533
+ output:
1534
+ Add_140:0: {}
1535
+ attr:
1536
+ output_dtype: s8
1537
+ Reshape_150:
1538
+ type: Reshape
1539
+ input:
1540
+ Add_140:0: {}
1541
+ input_ids:0: {}
1542
+ output:
1543
+ 262:0_quant: {}
1544
+ attr:
1545
+ dst_shape: 4,64,-1,-1
1546
+ dims: '0'
1547
+ Add_126:
1548
+ type: InnerProduct
1549
+ input:
1550
+ '588:0': {}
1551
+ 225:0_quant: {}
1552
+ bert.encoder.layer.1.attention.self.query.bias:0: {}
1553
+ 588:0_min: {}
1554
+ 588:0_max: {}
1555
+ 225:0_quant_min: {}
1556
+ 225:0_quant_max: {}
1557
+ Add_126:0_min: {}
1558
+ Add_126:0_max: {}
1559
+ output:
1560
+ Add_126:0: {}
1561
+ attr:
1562
+ output_dtype: s8
1563
+ Reshape_161:
1564
+ type: Reshape
1565
+ input:
1566
+ Add_126:0: {}
1567
+ input_ids:0: {}
1568
+ output:
1569
+ 277:0_quant: {}
1570
+ attr:
1571
+ dst_shape: 4,64,-1,-1
1572
+ dims: '0'
1573
+ Add_167:
1574
+ type: Matmul
1575
+ input:
1576
+ 277:0_quant: {}
1577
+ 245:0_quant: {}
1578
+ padding_sequence:0: {}
1579
+ 277:0_quant_min: {}
1580
+ 277:0_quant_max: {}
1581
+ 245:0_quant_min: {}
1582
+ 245:0_quant_max: {}
1583
+ 283:0_min: {}
1584
+ 283:0_max: {}
1585
+ output:
1586
+ '283:0': {}
1587
+ attr:
1588
+ src0_perm: 2,0,3,1
1589
+ src1_perm: 2,0,1,3
1590
+ output_scale: 0.125
1591
+ format_any: false
1592
+ append_op: binary_add
1593
+ Softmax_168:
1594
+ type: Softmax
1595
+ input:
1596
+ '283:0': {}
1597
+ 284:0_quant_min: {}
1598
+ 284:0_quant_max: {}
1599
+ output:
1600
+ 284:0_quant: {}
1601
+ attr:
1602
+ output_dtype: u8
1603
+ Transpose_170:
1604
+ type: Matmul
1605
+ input:
1606
+ 284:0_quant: {}
1607
+ 262:0_quant: {}
1608
+ 284:0_quant_min: {}
1609
+ 284:0_quant_max: {}
1610
+ 262:0_quant_min: {}
1611
+ 262:0_quant_max: {}
1612
+ 286:0_min: {}
1613
+ 286:0_max: {}
1614
+ output:
1615
+ '286:0': {}
1616
+ attr:
1617
+ src1_perm: 2,0,3,1
1618
+ dst_perm: 1,3,0,2
1619
+ output_dtype: u8
1620
+ Reshape_180:
1621
+ type: Reshape
1622
+ input:
1623
+ '286:0': {}
1624
+ output:
1625
+ 298:0_quant: {}
1626
+ attr:
1627
+ dst_shape: 256,-1
1628
+ Add_183:
1629
+ type: InnerProduct
1630
+ input:
1631
+ '598:0': {}
1632
+ 298:0_quant: {}
1633
+ bert.encoder.layer.1.attention.output.dense.bias:0: {}
1634
+ '225:0': {}
1635
+ 598:0_min: {}
1636
+ 598:0_max: {}
1637
+ 298:0_quant_min: {}
1638
+ 298:0_quant_max: {}
1639
+ 302:0_min: {}
1640
+ 302:0_max: {}
1641
+ output:
1642
+ '302:0': {}
1643
+ attr:
1644
+ append_op: sum
1645
+ Add_194:
1646
+ type: LayerNorm
1647
+ input:
1648
+ '302:0': {}
1649
+ bert.encoder.layer.1.attention.output.LayerNorm.weight:0: {}
1650
+ bert.encoder.layer.1.attention.output.LayerNorm.bias:0: {}
1651
+ output:
1652
+ '313:0': {}
1653
+ attr:
1654
+ epsilon: 9.999999960041972e-13
1655
+ transpose_mode: 1,0
1656
+ Mul_204_quant_0:
1657
+ type: Quantize
1658
+ input:
1659
+ '313:0': {}
1660
+ 313:0_min: {}
1661
+ 313:0_max: {}
1662
+ output:
1663
+ 313:0_quant: {}
1664
+ attr:
1665
+ output_dtype: u8
1666
+ Mul_204:
1667
+ type: InnerProduct
1668
+ input:
1669
+ '599:0': {}
1670
+ 313:0_quant: {}
1671
+ bert.encoder.layer.1.intermediate.dense.bias:0: {}
1672
+ 599:0_min: {}
1673
+ 599:0_max: {}
1674
+ 313:0_quant_min: {}
1675
+ 313:0_quant_max: {}
1676
+ 324:0_quant_min: {}
1677
+ 324:0_quant_max: {}
1678
+ output:
1679
+ 324:0_quant: {}
1680
+ Mul_204_gelu:
1681
+ type: Gelu
1682
+ input:
1683
+ 324:0_quant: {}
1684
+ output:
1685
+ 324:0_quant_gelu: {}
1686
+ attr:
1687
+ algorithm: gelu_tanh
1688
+ Mul_204_gelu_quant:
1689
+ type: Quantize
1690
+ input:
1691
+ 324:0_quant_gelu: {}
1692
+ 324:0_quant_min: {}
1693
+ 324:0_quant_max: {}
1694
+ output:
1695
+ 324:0_quant_quant: {}
1696
+ attr:
1697
+ output_dtype: u8
1698
+ Add_207:
1699
+ type: InnerProduct
1700
+ input:
1701
+ '600:0': {}
1702
+ 324:0_quant_quant: {}
1703
+ bert.encoder.layer.1.output.dense.bias:0: {}
1704
+ '313:0': {}
1705
+ 600:0_min: {}
1706
+ 600:0_max: {}
1707
+ 324:0_quant_min: {}
1708
+ 324:0_quant_max: {}
1709
+ 328:0_min: {}
1710
+ 328:0_max: {}
1711
+ output:
1712
+ '328:0': {}
1713
+ attr:
1714
+ append_op: sum
1715
+ Add_218:
1716
+ type: LayerNorm
1717
+ input:
1718
+ '328:0': {}
1719
+ bert.encoder.layer.1.output.LayerNorm.weight:0: {}
1720
+ bert.encoder.layer.1.output.LayerNorm.bias:0: {}
1721
+ output:
1722
+ '339:0': {}
1723
+ attr:
1724
+ epsilon: 9.999999960041972e-13
1725
+ transpose_mode: 1,0
1726
+ Add_222_quant_0:
1727
+ type: Quantize
1728
+ input:
1729
+ '339:0': {}
1730
+ 339:0_min: {}
1731
+ 339:0_max: {}
1732
+ output:
1733
+ 339:0_quant: {}
1734
+ attr:
1735
+ output_dtype: u8
1736
+ Add_222:
1737
+ type: InnerProduct
1738
+ input:
1739
+ '602:0': {}
1740
+ 339:0_quant: {}
1741
+ bert.encoder.layer.2.attention.self.key.bias:0: {}
1742
+ 602:0_min: {}
1743
+ 602:0_max: {}
1744
+ 339:0_quant_min: {}
1745
+ 339:0_quant_max: {}
1746
+ Add_222:0_min: {}
1747
+ Add_222:0_max: {}
1748
+ output:
1749
+ Add_222:0: {}
1750
+ attr:
1751
+ output_dtype: s8
1752
+ Reshape_232:
1753
+ type: Reshape
1754
+ input:
1755
+ Add_222:0: {}
1756
+ input_ids:0: {}
1757
+ output:
1758
+ 359:0_quant: {}
1759
+ attr:
1760
+ dst_shape: 4,64,-1,-1
1761
+ dims: '0'
1762
+ Add_234:
1763
+ type: InnerProduct
1764
+ input:
1765
+ '605:0': {}
1766
+ 339:0_quant: {}
1767
+ bert.encoder.layer.2.attention.self.value.bias:0: {}
1768
+ 605:0_min: {}
1769
+ 605:0_max: {}
1770
+ 339:0_quant_min: {}
1771
+ 339:0_quant_max: {}
1772
+ Add_234:0_min: {}
1773
+ Add_234:0_max: {}
1774
+ output:
1775
+ Add_234:0: {}
1776
+ attr:
1777
+ output_dtype: s8
1778
+ Reshape_244:
1779
+ type: Reshape
1780
+ input:
1781
+ Add_234:0: {}
1782
+ input_ids:0: {}
1783
+ output:
1784
+ 376:0_quant: {}
1785
+ attr:
1786
+ dst_shape: 4,64,-1,-1
1787
+ dims: '0'
1788
+ Add_220:
1789
+ type: InnerProduct
1790
+ input:
1791
+ '601:0': {}
1792
+ 339:0_quant: {}
1793
+ bert.encoder.layer.2.attention.self.query.bias:0: {}
1794
+ 601:0_min: {}
1795
+ 601:0_max: {}
1796
+ 339:0_quant_min: {}
1797
+ 339:0_quant_max: {}
1798
+ Add_220:0_min: {}
1799
+ Add_220:0_max: {}
1800
+ output:
1801
+ Add_220:0: {}
1802
+ attr:
1803
+ output_dtype: s8
1804
+ Reshape_255:
1805
+ type: Reshape
1806
+ input:
1807
+ Add_220:0: {}
1808
+ input_ids:0: {}
1809
+ output:
1810
+ 391:0_quant: {}
1811
+ attr:
1812
+ dst_shape: 4,64,-1,-1
1813
+ dims: '0'
1814
+ Add_261:
1815
+ type: Matmul
1816
+ input:
1817
+ 391:0_quant: {}
1818
+ 359:0_quant: {}
1819
+ padding_sequence:0: {}
1820
+ 391:0_quant_min: {}
1821
+ 391:0_quant_max: {}
1822
+ 359:0_quant_min: {}
1823
+ 359:0_quant_max: {}
1824
+ 397:0_min: {}
1825
+ 397:0_max: {}
1826
+ output:
1827
+ '397:0': {}
1828
+ attr:
1829
+ src0_perm: 2,0,3,1
1830
+ src1_perm: 2,0,1,3
1831
+ output_scale: 0.125
1832
+ format_any: false
1833
+ append_op: binary_add
1834
+ Softmax_262:
1835
+ type: Softmax
1836
+ input:
1837
+ '397:0': {}
1838
+ 398:0_quant_min: {}
1839
+ 398:0_quant_max: {}
1840
+ output:
1841
+ 398:0_quant: {}
1842
+ attr:
1843
+ output_dtype: u8
1844
+ Transpose_264:
1845
+ type: Matmul
1846
+ input:
1847
+ 398:0_quant: {}
1848
+ 376:0_quant: {}
1849
+ 398:0_quant_min: {}
1850
+ 398:0_quant_max: {}
1851
+ 376:0_quant_min: {}
1852
+ 376:0_quant_max: {}
1853
+ 400:0_min: {}
1854
+ 400:0_max: {}
1855
+ output:
1856
+ '400:0': {}
1857
+ attr:
1858
+ src1_perm: 2,0,3,1
1859
+ dst_perm: 1,3,0,2
1860
+ output_dtype: u8
1861
+ Reshape_274:
1862
+ type: Reshape
1863
+ input:
1864
+ '400:0': {}
1865
+ output:
1866
+ 412:0_quant: {}
1867
+ attr:
1868
+ dst_shape: 256,-1
1869
+ Add_277:
1870
+ type: InnerProduct
1871
+ input:
1872
+ '611:0': {}
1873
+ 412:0_quant: {}
1874
+ bert.encoder.layer.2.attention.output.dense.bias:0: {}
1875
+ '339:0': {}
1876
+ 611:0_min: {}
1877
+ 611:0_max: {}
1878
+ 412:0_quant_min: {}
1879
+ 412:0_quant_max: {}
1880
+ 416:0_min: {}
1881
+ 416:0_max: {}
1882
+ output:
1883
+ '416:0': {}
1884
+ attr:
1885
+ append_op: sum
1886
+ Add_288:
1887
+ type: LayerNorm
1888
+ input:
1889
+ '416:0': {}
1890
+ bert.encoder.layer.2.attention.output.LayerNorm.weight:0: {}
1891
+ bert.encoder.layer.2.attention.output.LayerNorm.bias:0: {}
1892
+ output:
1893
+ '427:0': {}
1894
+ attr:
1895
+ epsilon: 9.999999960041972e-13
1896
+ transpose_mode: 1,0
1897
+ Mul_298_quant_0:
1898
+ type: Quantize
1899
+ input:
1900
+ '427:0': {}
1901
+ 427:0_min: {}
1902
+ 427:0_max: {}
1903
+ output:
1904
+ 427:0_quant: {}
1905
+ attr:
1906
+ output_dtype: u8
1907
+ Mul_298:
1908
+ type: InnerProduct
1909
+ input:
1910
+ '612:0': {}
1911
+ 427:0_quant: {}
1912
+ bert.encoder.layer.2.intermediate.dense.bias:0: {}
1913
+ 612:0_min: {}
1914
+ 612:0_max: {}
1915
+ 427:0_quant_min: {}
1916
+ 427:0_quant_max: {}
1917
+ 438:0_quant_min: {}
1918
+ 438:0_quant_max: {}
1919
+ output:
1920
+ 438:0_quant: {}
1921
+ Mul_298_gelu:
1922
+ type: Gelu
1923
+ input:
1924
+ 438:0_quant: {}
1925
+ output:
1926
+ 438:0_quant_gelu: {}
1927
+ attr:
1928
+ algorithm: gelu_tanh
1929
+ Mul_298_gelu_quant:
1930
+ type: Quantize
1931
+ input:
1932
+ 438:0_quant_gelu: {}
1933
+ 438:0_quant_min: {}
1934
+ 438:0_quant_max: {}
1935
+ output:
1936
+ 438:0_quant_quant: {}
1937
+ attr:
1938
+ output_dtype: u8
1939
+ Add_301:
1940
+ type: InnerProduct
1941
+ input:
1942
+ '613:0': {}
1943
+ 438:0_quant_quant: {}
1944
+ bert.encoder.layer.2.output.dense.bias:0: {}
1945
+ '427:0': {}
1946
+ 613:0_min: {}
1947
+ 613:0_max: {}
1948
+ 438:0_quant_min: {}
1949
+ 438:0_quant_max: {}
1950
+ 442:0_min: {}
1951
+ 442:0_max: {}
1952
+ output:
1953
+ '442:0': {}
1954
+ attr:
1955
+ append_op: sum
1956
+ Add_312:
1957
+ type: LayerNorm
1958
+ input:
1959
+ '442:0': {}
1960
+ bert.encoder.layer.2.output.LayerNorm.weight:0: {}
1961
+ bert.encoder.layer.2.output.LayerNorm.bias:0: {}
1962
+ output:
1963
+ '453:0': {}
1964
+ attr:
1965
+ epsilon: 9.999999960041972e-13
1966
+ transpose_mode: 1,0
1967
+ Add_316_quant_0:
1968
+ type: Quantize
1969
+ input:
1970
+ '453:0': {}
1971
+ 453:0_min: {}
1972
+ 453:0_max: {}
1973
+ output:
1974
+ 453:0_quant: {}
1975
+ attr:
1976
+ output_dtype: u8
1977
+ Add_316:
1978
+ type: InnerProduct
1979
+ input:
1980
+ '615:0': {}
1981
+ 453:0_quant: {}
1982
+ bert.encoder.layer.3.attention.self.key.bias:0: {}
1983
+ 615:0_min: {}
1984
+ 615:0_max: {}
1985
+ 453:0_quant_min: {}
1986
+ 453:0_quant_max: {}
1987
+ Add_316:0_min: {}
1988
+ Add_316:0_max: {}
1989
+ output:
1990
+ Add_316:0: {}
1991
+ attr:
1992
+ output_dtype: s8
1993
+ Reshape_326:
1994
+ type: Reshape
1995
+ input:
1996
+ Add_316:0: {}
1997
+ input_ids:0: {}
1998
+ output:
1999
+ 473:0_quant: {}
2000
+ attr:
2001
+ dst_shape: 4,64,-1,-1
2002
+ dims: '0'
2003
+ Add_328:
2004
+ type: InnerProduct
2005
+ input:
2006
+ '618:0': {}
2007
+ 453:0_quant: {}
2008
+ bert.encoder.layer.3.attention.self.value.bias:0: {}
2009
+ 618:0_min: {}
2010
+ 618:0_max: {}
2011
+ 453:0_quant_min: {}
2012
+ 453:0_quant_max: {}
2013
+ Add_328:0_min: {}
2014
+ Add_328:0_max: {}
2015
+ output:
2016
+ Add_328:0: {}
2017
+ attr:
2018
+ output_dtype: s8
2019
+ Reshape_338:
2020
+ type: Reshape
2021
+ input:
2022
+ Add_328:0: {}
2023
+ input_ids:0: {}
2024
+ output:
2025
+ 490:0_quant: {}
2026
+ attr:
2027
+ dst_shape: 4,64,-1,-1
2028
+ dims: '0'
2029
+ Add_314:
2030
+ type: InnerProduct
2031
+ input:
2032
+ '614:0': {}
2033
+ 453:0_quant: {}
2034
+ bert.encoder.layer.3.attention.self.query.bias:0: {}
2035
+ 614:0_min: {}
2036
+ 614:0_max: {}
2037
+ 453:0_quant_min: {}
2038
+ 453:0_quant_max: {}
2039
+ Add_314:0_min: {}
2040
+ Add_314:0_max: {}
2041
+ output:
2042
+ Add_314:0: {}
2043
+ attr:
2044
+ output_dtype: s8
2045
+ Reshape_349:
2046
+ type: Reshape
2047
+ input:
2048
+ Add_314:0: {}
2049
+ input_ids:0: {}
2050
+ output:
2051
+ 505:0_quant: {}
2052
+ attr:
2053
+ dst_shape: 4,64,-1,-1
2054
+ dims: '0'
2055
+ Add_355:
2056
+ type: Matmul
2057
+ input:
2058
+ 505:0_quant: {}
2059
+ 473:0_quant: {}
2060
+ padding_sequence:0: {}
2061
+ 505:0_quant_min: {}
2062
+ 505:0_quant_max: {}
2063
+ 473:0_quant_min: {}
2064
+ 473:0_quant_max: {}
2065
+ 511:0_min: {}
2066
+ 511:0_max: {}
2067
+ output:
2068
+ '511:0': {}
2069
+ attr:
2070
+ src0_perm: 2,0,3,1
2071
+ src1_perm: 2,0,1,3
2072
+ output_scale: 0.125
2073
+ format_any: false
2074
+ append_op: binary_add
2075
+ Softmax_356:
2076
+ type: Softmax
2077
+ input:
2078
+ '511:0': {}
2079
+ 512:0_quant_min: {}
2080
+ 512:0_quant_max: {}
2081
+ output:
2082
+ 512:0_quant: {}
2083
+ attr:
2084
+ output_dtype: u8
2085
+ Transpose_358:
2086
+ type: Matmul
2087
+ input:
2088
+ 512:0_quant: {}
2089
+ 490:0_quant: {}
2090
+ 512:0_quant_min: {}
2091
+ 512:0_quant_max: {}
2092
+ 490:0_quant_min: {}
2093
+ 490:0_quant_max: {}
2094
+ 514:0_min: {}
2095
+ 514:0_max: {}
2096
+ output:
2097
+ '514:0': {}
2098
+ attr:
2099
+ src1_perm: 2,0,3,1
2100
+ dst_perm: 1,3,0,2
2101
+ output_dtype: u8
2102
+ Reshape_368:
2103
+ type: Reshape
2104
+ input:
2105
+ '514:0': {}
2106
+ output:
2107
+ 526:0_quant: {}
2108
+ attr:
2109
+ dst_shape: 256,-1
2110
+ Add_371:
2111
+ type: InnerProduct
2112
+ input:
2113
+ '624:0': {}
2114
+ 526:0_quant: {}
2115
+ bert.encoder.layer.3.attention.output.dense.bias:0: {}
2116
+ '453:0': {}
2117
+ 624:0_min: {}
2118
+ 624:0_max: {}
2119
+ 526:0_quant_min: {}
2120
+ 526:0_quant_max: {}
2121
+ 530:0_min: {}
2122
+ 530:0_max: {}
2123
+ output:
2124
+ '530:0': {}
2125
+ attr:
2126
+ append_op: sum
2127
+ Add_382:
2128
+ type: LayerNorm
2129
+ input:
2130
+ '530:0': {}
2131
+ bert.encoder.layer.3.attention.output.LayerNorm.weight:0: {}
2132
+ bert.encoder.layer.3.attention.output.LayerNorm.bias:0: {}
2133
+ output:
2134
+ '541:0': {}
2135
+ attr:
2136
+ epsilon: 9.999999960041972e-13
2137
+ transpose_mode: 1,0
2138
+ Mul_392_quant_0:
2139
+ type: Quantize
2140
+ input:
2141
+ '541:0': {}
2142
+ 541:0_min: {}
2143
+ 541:0_max: {}
2144
+ output:
2145
+ 541:0_quant: {}
2146
+ attr:
2147
+ output_dtype: u8
2148
+ Mul_392:
2149
+ type: InnerProduct
2150
+ input:
2151
+ '625:0': {}
2152
+ 541:0_quant: {}
2153
+ bert.encoder.layer.3.intermediate.dense.bias:0: {}
2154
+ 625:0_min: {}
2155
+ 625:0_max: {}
2156
+ 541:0_quant_min: {}
2157
+ 541:0_quant_max: {}
2158
+ 552:0_quant_min: {}
2159
+ 552:0_quant_max: {}
2160
+ output:
2161
+ 552:0_quant: {}
2162
+ Mul_392_gelu:
2163
+ type: Gelu
2164
+ input:
2165
+ 552:0_quant: {}
2166
+ output:
2167
+ 552:0_quant_gelu: {}
2168
+ attr:
2169
+ algorithm: gelu_tanh
2170
+ Mul_392_gelu_quant:
2171
+ type: Quantize
2172
+ input:
2173
+ 552:0_quant_gelu: {}
2174
+ 552:0_quant_min: {}
2175
+ 552:0_quant_max: {}
2176
+ output:
2177
+ 552:0_quant_quant: {}
2178
+ attr:
2179
+ output_dtype: u8
2180
+ Add_395:
2181
+ type: InnerProduct
2182
+ input:
2183
+ '626:0': {}
2184
+ 552:0_quant_quant: {}
2185
+ bert.encoder.layer.3.output.dense.bias:0: {}
2186
+ '541:0': {}
2187
+ 626:0_min: {}
2188
+ 626:0_max: {}
2189
+ 552:0_quant_min: {}
2190
+ 552:0_quant_max: {}
2191
+ 556:0_min: {}
2192
+ 556:0_max: {}
2193
+ output:
2194
+ '556:0': {}
2195
+ attr:
2196
+ append_op: sum
2197
+ Add_406_reorder_pre:
2198
+ type: Reorder
2199
+ input:
2200
+ '556:0': {}
2201
+ output:
2202
+ 556:0_reorder: {}
2203
+ attr:
2204
+ src_perm: 0,1
2205
+ dst_perm: 1,0
2206
+ Add_406:
2207
+ type: LayerNorm
2208
+ input:
2209
+ 556:0_reorder: {}
2210
+ bert.encoder.layer.3.output.LayerNorm.weight:0: {}
2211
+ bert.encoder.layer.3.output.LayerNorm.bias:0: {}
2212
+ output:
2213
+ Add_406:0: {}
2214
+ attr:
2215
+ epsilon: 9.999999960041972e-13
2216
+ last_layer_reshape:
2217
+ type: Reshape
2218
+ input:
2219
+ Add_406:0: {}
2220
+ input_ids:0: {}
2221
+ output:
2222
+ last_layer_reshape:0: {}
2223
+ attr:
2224
+ dst_shape: -1,-1,256
2225
+ dims: 0,1
2226
+ last_layer_strided_slice:
2227
+ type: StridedSlice
2228
+ input:
2229
+ last_layer_reshape:0: {}
2230
+ output:
2231
+ last_layer_strided_slice:0: {}
2232
+ attr:
2233
+ begin_mask: 5
2234
+ ellipsis_mask: 0
2235
+ end_mask: 5
2236
+ new_axis_mask: 0
2237
+ shrink_axis_mask: 0
2238
+ begin: 0,0,0
2239
+ end: 0,1,0
2240
+ strides: 1,1,1
2241
+ Gather_408:
2242
+ type: Reshape
2243
+ input:
2244
+ last_layer_strided_slice:0: {}
2245
+ output:
2246
+ '569:0': {}
2247
+ attr:
2248
+ dst_shape: -1,256
2249
+ Tanh_410_quant_0:
2250
+ type: Quantize
2251
+ input:
2252
+ '569:0': {}
2253
+ 569:0_min: {}
2254
+ 569:0_max: {}
2255
+ output:
2256
+ 569:0_quant: {}
2257
+ attr:
2258
+ output_dtype: u8
2259
+ Tanh_410:
2260
+ type: InnerProduct
2261
+ input:
2262
+ 569:0_quant: {}
2263
+ bert.pooler.dense.weight:0: {}
2264
+ bert.pooler.dense.bias:0: {}
2265
+ 569:0_quant_min: {}
2266
+ 569:0_quant_max: {}
2267
+ bert.pooler.dense.weight:0_min: {}
2268
+ bert.pooler.dense.weight:0_max: {}
2269
+ 571:0_quant_min: {}
2270
+ 571:0_quant_max: {}
2271
+ output:
2272
+ 571:0_quant: {}
2273
+ attr:
2274
+ src1_perm: 0,1
2275
+ append_op: tanh
2276
+ output_dtype: u8
2277
+ Gemm_411:
2278
+ type: InnerProduct
2279
+ input:
2280
+ 571:0_quant: {}
2281
+ classifier.weight:0: {}
2282
+ classifier.bias:0: {}
2283
+ 571:0_quant_min: {}
2284
+ 571:0_quant_max: {}
2285
+ classifier.weight:0_min: {}
2286
+ classifier.weight:0_max: {}
2287
+ output:0_min: {}
2288
+ output:0_max: {}
2289
+ output:
2290
+ output:0: {}
2291
+ attr:
2292
+ src1_perm: 0,1
2293
+ output_data:
2294
+ type: Output
2295
+ input:
2296
+ output:0: {}
2297
+ #'199:0': {}
2298
+ #'188:0': {}
2299
+ #184:0_quant: {}
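The `last_layer_strided_slice` node in this file is easier to read once its masks are decoded. The sketch below assumes the masks follow TensorFlow-style `StridedSlice` semantics (if bit i of `begin_mask` or `end_mask` is set, the `begin` or `end` value for dimension i is ignored and the slice runs to the edge). The file itself does not confirm this, and `decode_strided_slice` is a hypothetical helper, not part of any toolkit.

```python
def decode_strided_slice(begin, end, strides, begin_mask, end_mask):
    """Return one Python slice per dimension.

    Assumption (TensorFlow-style semantics, not confirmed for Neural Engine):
    if bit i of begin_mask is set, begin[i] is ignored (start at the edge);
    if bit i of end_mask is set, end[i] is ignored (run to the edge).
    """
    slices = []
    for i, (b, e, s) in enumerate(zip(begin, end, strides)):
        start = None if (begin_mask >> i) & 1 else b
        stop = None if (end_mask >> i) & 1 else e
        slices.append(slice(start, stop, s))
    return slices


# Attributes copied from the last_layer_strided_slice node.
slices = decode_strided_slice(
    begin=[0, 0, 0], end=[0, 1, 0], strides=[1, 1, 1],
    begin_mask=5, end_mask=5,
)

# Toy tensor shaped [batch=2, seq_len=3, hidden=2] as nested lists.
hidden_states = [[[b * 100 + t * 10 + h for h in range(2)] for t in range(3)]
                 for b in range(2)]

# Apply the three slices dimension by dimension.
first_token = [[row[slices[2]] for row in seq[slices[1]]]
               for seq in hidden_states[slices[0]]]
```

Under that reading, mask value 5 (binary 101) leaves only dimension 1 constrained, so the node keeps sequence position 0 for every batch item and every hidden unit. In BERT that is the `[CLS]` position, which the following `Reshape` flattens to `-1,256` before the pooler.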
Neural_Engine_INT8_IR/model.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:65046bc15bbc90be9917710793edf85d92dcfd1df7bea5da2c9969fd17327ae4
3
+ size 35125504
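The pointer size lines up with the last `location` entry in `conf.yaml`: `location: [35125500, 4]` for a single fp32 value ends at byte 35125504, exactly the size of `model.bin`. That suggests, though the diff does not state it, that `location` means `[byte offset, byte length]` into `model.bin`. A minimal sketch under that assumption, and further assuming little-endian fp32 storage; `read_fp32_tensor` is a hypothetical helper, not an NLP Toolkit API.

```python
import struct


def read_fp32_tensor(blob, location):
    """Decode fp32 values from `blob` at location = [offset, nbytes].

    Assumptions: `location` is a byte offset plus a byte length, and values
    are stored as little-endian IEEE-754 float32.
    """
    offset, nbytes = location
    if nbytes % 4 != 0:
        raise ValueError("fp32 data must be a multiple of 4 bytes")
    if offset + nbytes > len(blob):
        raise ValueError("location runs past the end of the blob")
    count = nbytes // 4
    return list(struct.unpack_from("<%df" % count, blob, offset))


# Consistency check using the numbers from the diff.
MODEL_BIN_SIZE = 35125504
last_location = [35125500, 4]
ends_at_eof = sum(last_location) == MODEL_BIN_SIZE

# Demo on a small in-memory blob instead of the real 35 MB file.
blob = b"\x00" * 8 + struct.pack("<2f", 0.5, -2.0)
values = read_fp32_tensor(blob, [8, 8])
```

With the real file, `blob` would be the bytes of the downloaded `model.bin` (after Git LFS has resolved the pointer), and `location` would come from the matching tensor entry in `conf.yaml`.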
README.md CHANGED
@@ -1,3 +1,21 @@
1
  ---
2
  license: mit
3
  ---
4
+
5
+ # Sparse BERT mini model (uncased)
6
+
7
+ Finetuned model pruned to 1:4 structured sparsity.
8
+ The model is a pruned version of the [BERT mini model](https://huggingface.co/prajjwal1/bert-mini).
9
+
10
+ ## Intended Use
11
+
12
+ The model can be used for inference with sparsity optimization.
13
+ For further details on the model and its usage, see our repo and our implementation available [here](https://github.com/intel-innersource/frameworks.ai.nlp-toolkit.intel-nlp-toolkit).
14
+ We also provide the quantized INT8 sparse BERT mini Neural Engine IR (accuracy 87.15) here; it can be used directly for inference with NLP Toolkit.
15
+
16
+ ## Evaluation Results
17
+ We get the following results on the SST-2 task's development set:
18
+
19
+ | Task | SST-2 (Acc) |
20
+ |------|-------------|
21
+ | | 87.2 |
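To make the sparsity claim concrete, here is a minimal pure-Python sketch that measures block sparsity of a weight matrix. It assumes that a 1x4 block means four consecutive weights within a row that are pruned together. That interpretation is based on the `1X4-block` suffix in the model name, not on anything this README spells out, and `block_sparsity` is a hypothetical helper.

```python
def block_sparsity(weights, block_width=4):
    """Fraction of 1 x block_width blocks that are entirely zero.

    Assumption: blocks are non-overlapping runs of `block_width`
    consecutive values within a row.
    """
    total = 0
    zero_blocks = 0
    for row in weights:
        if len(row) % block_width != 0:
            raise ValueError("row length must be a multiple of block_width")
        for start in range(0, len(row), block_width):
            block = row[start:start + block_width]
            total += 1
            if all(v == 0 for v in block):
                zero_blocks += 1
    return zero_blocks / total if total else 0.0


# Toy 2 x 8 matrix: four blocks, three of them fully zero.
toy = [
    [0.0, 0.0, 0.0, 0.0, 0.3, -0.1, 0.0, 0.2],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
]
ratio = block_sparsity(toy)
```

On real weights the rows would come from each dense layer's weight matrix. Which layers were pruned, and along which axis the blocks run, is not stated here and would need checking against the checkpoint.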
config.json ADDED
@@ -0,0 +1,35 @@
1
+ {
2
+ "_name_or_path": "Intel/bert-mini-sst2-distilled-sparse-90-1X4-block",
3
+ "architectures": [
4
+ "BertForSequenceClassification"
5
+ ],
6
+ "attention_probs_dropout_prob": 0.1,
7
+ "classifier_dropout": null,
8
+ "finetuning_task": "sst2",
9
+ "hidden_act": "gelu",
10
+ "hidden_dropout_prob": 0.1,
11
+ "hidden_size": 256,
12
+ "id2label": {
13
+ "0": "0",
14
+ "1": "1"
15
+ },
16
+ "initializer_range": 0.02,
17
+ "intermediate_size": 1024,
18
+ "label2id": {
19
+ "0": 0,
20
+ "1": 1
21
+ },
22
+ "layer_norm_eps": 1e-12,
23
+ "max_position_embeddings": 512,
24
+ "model_type": "bert",
25
+ "num_attention_heads": 4,
26
+ "num_hidden_layers": 4,
27
+ "pad_token_id": 0,
28
+ "position_embedding_type": "absolute",
29
+ "problem_type": "single_label_classification",
30
+ "torch_dtype": "float32",
31
+ "transformers_version": "4.16.0",
32
+ "type_vocab_size": 2,
33
+ "use_cache": true,
34
+ "vocab_size": 30522
35
+ }
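These values tie back to the Neural Engine graph in `conf.yaml`: a `hidden_size` of 256 split over 4 heads gives a head size of 64, matching the `dst_shape: 4,64,-1,-1` reshapes, and 1/sqrt(64) = 0.125 matches the `output_scale` on the query-key `Matmul` nodes. A short sketch of that arithmetic, assuming `output_scale` is the usual scaled dot-product attention factor (the diff does not say so explicitly):

```python
import math

# Values copied from config.json.
hidden_size = 256
num_attention_heads = 4

# Standard multi-head attention splits the hidden dimension evenly.
head_size = hidden_size // num_attention_heads

# Scaled dot-product attention divides Q.K^T by sqrt(head_size).
attention_scale = 1.0 / math.sqrt(head_size)

# Values copied from conf.yaml (Reshape dst_shape and Matmul output_scale).
ir_heads, ir_head_size = 4, 64
ir_output_scale = 0.125

matches = (
    (num_attention_heads, head_size) == (ir_heads, ir_head_size)
    and attention_scale == ir_output_scale
)
```

A check like this is a cheap sanity test when pairing an exported IR with a `config.json`: a mismatch in head count or scale would suggest the two files describe different architectures.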
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:e3faaf7afce1767fab009b3dd9c095ff0495217d350f04a804a565bad25b9ab5
3
+ size 44717063
special_tokens_map.json ADDED
@@ -0,0 +1 @@
 
1
+ {"unk_token": "[UNK]", "sep_token": "[SEP]", "pad_token": "[PAD]", "cls_token": "[CLS]", "mask_token": "[MASK]"}
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
tokenizer_config.json ADDED
@@ -0,0 +1 @@
 
1
+ {"do_lower_case": true, "unk_token": "[UNK]", "sep_token": "[SEP]", "pad_token": "[PAD]", "cls_token": "[CLS]", "mask_token": "[MASK]", "tokenize_chinese_chars": true, "strip_accents": null, "special_tokens_map_file": null, "name_or_path": "google/bert_uncased_L-4_H-256_A-4", "do_basic_tokenize": true, "never_split": null, "tokenizer_class": "BertTokenizer"}
vocab.txt ADDED
The diff for this file is too large to render. See raw diff