zpn committed on
Commit
487b330
1 Parent(s): b49751a

Create base_shapes.bsh

Files changed (1)
  1. base_shapes.bsh +939 -0
base_shapes.bsh ADDED
@@ -0,0 +1,939 @@
# This is a base shape file encoded in yaml
# - `null` indicates a dimension is "finite", i.e. a non-"width" dimension
# - a number indicates the base dimension of an "infinite" dimension, i.e. some notion of "width"
bert.embeddings.norm.bias:
- 64
bert.embeddings.norm.weight:
- 64
bert.embeddings.position_embeddings.weight:
- null
- 64
bert.embeddings.token_type_embeddings.weight:
- null
- 64
bert.embeddings.word_embeddings.weight:
- null
- 64
bert.encoder.layer.0.attention.output.dense.bias:
- 64
bert.encoder.layer.0.attention.output.dense.weight:
- 64
- 64
bert.encoder.layer.0.attention.output.norm.bias:
- 64
bert.encoder.layer.0.attention.output.norm.weight:
- 64
bert.encoder.layer.0.attention.self.key.bias:
- 64
bert.encoder.layer.0.attention.self.key.weight:
- 64
- 64
bert.encoder.layer.0.attention.self.query.bias:
- 64
bert.encoder.layer.0.attention.self.query.weight:
- 64
- 64
bert.encoder.layer.0.attention.self.value.bias:
- 64
bert.encoder.layer.0.attention.self.value.weight:
- 64
- 64
bert.encoder.layer.0.intermediate.dense.bias:
- 256
bert.encoder.layer.0.intermediate.dense.weight:
- 256
- 64
bert.encoder.layer.0.output.LayerNorm.bias:
- 64
bert.encoder.layer.0.output.LayerNorm.weight:
- 64
bert.encoder.layer.0.output.dense.bias:
- 64
bert.encoder.layer.0.output.dense.weight:
- 64
- 256
bert.encoder.layer.1.attention.output.dense.bias:
- 64
bert.encoder.layer.1.attention.output.dense.weight:
- 64
- 64
bert.encoder.layer.1.attention.output.norm.bias:
- 64
bert.encoder.layer.1.attention.output.norm.weight:
- 64
bert.encoder.layer.1.attention.self.key.bias:
- 64
bert.encoder.layer.1.attention.self.key.weight:
- 64
- 64
bert.encoder.layer.1.attention.self.query.bias:
- 64
bert.encoder.layer.1.attention.self.query.weight:
- 64
- 64
bert.encoder.layer.1.attention.self.value.bias:
- 64
bert.encoder.layer.1.attention.self.value.weight:
- 64
- 64
bert.encoder.layer.1.intermediate.dense.bias:
- 256
bert.encoder.layer.1.intermediate.dense.weight:
- 256
- 64
bert.encoder.layer.1.output.LayerNorm.bias:
- 64
bert.encoder.layer.1.output.LayerNorm.weight:
- 64
bert.encoder.layer.1.output.dense.bias:
- 64
bert.encoder.layer.1.output.dense.weight:
- 64
- 256
bert.encoder.layer.10.attention.output.dense.bias:
- 64
bert.encoder.layer.10.attention.output.dense.weight:
- 64
- 64
bert.encoder.layer.10.attention.output.norm.bias:
- 64
bert.encoder.layer.10.attention.output.norm.weight:
- 64
bert.encoder.layer.10.attention.self.key.bias:
- 64
bert.encoder.layer.10.attention.self.key.weight:
- 64
- 64
bert.encoder.layer.10.attention.self.query.bias:
- 64
bert.encoder.layer.10.attention.self.query.weight:
- 64
- 64
bert.encoder.layer.10.attention.self.value.bias:
- 64
bert.encoder.layer.10.attention.self.value.weight:
- 64
- 64
bert.encoder.layer.10.intermediate.dense.bias:
- 256
bert.encoder.layer.10.intermediate.dense.weight:
- 256
- 64
bert.encoder.layer.10.output.LayerNorm.bias:
- 64
bert.encoder.layer.10.output.LayerNorm.weight:
- 64
bert.encoder.layer.10.output.dense.bias:
- 64
bert.encoder.layer.10.output.dense.weight:
- 64
- 256
bert.encoder.layer.11.attention.output.dense.bias:
- 64
bert.encoder.layer.11.attention.output.dense.weight:
- 64
- 64
bert.encoder.layer.11.attention.output.norm.bias:
- 64
bert.encoder.layer.11.attention.output.norm.weight:
- 64
bert.encoder.layer.11.attention.self.key.bias:
- 64
bert.encoder.layer.11.attention.self.key.weight:
- 64
- 64
bert.encoder.layer.11.attention.self.query.bias:
- 64
bert.encoder.layer.11.attention.self.query.weight:
- 64
- 64
bert.encoder.layer.11.attention.self.value.bias:
- 64
bert.encoder.layer.11.attention.self.value.weight:
- 64
- 64
bert.encoder.layer.11.intermediate.dense.bias:
- 256
bert.encoder.layer.11.intermediate.dense.weight:
- 256
- 64
bert.encoder.layer.11.output.LayerNorm.bias:
- 64
bert.encoder.layer.11.output.LayerNorm.weight:
- 64
bert.encoder.layer.11.output.dense.bias:
- 64
bert.encoder.layer.11.output.dense.weight:
- 64
- 256
bert.encoder.layer.12.attention.output.dense.bias:
- 64
bert.encoder.layer.12.attention.output.dense.weight:
- 64
- 64
bert.encoder.layer.12.attention.output.norm.bias:
- 64
bert.encoder.layer.12.attention.output.norm.weight:
- 64
bert.encoder.layer.12.attention.self.key.bias:
- 64
bert.encoder.layer.12.attention.self.key.weight:
- 64
- 64
bert.encoder.layer.12.attention.self.query.bias:
- 64
bert.encoder.layer.12.attention.self.query.weight:
- 64
- 64
bert.encoder.layer.12.attention.self.value.bias:
- 64
bert.encoder.layer.12.attention.self.value.weight:
- 64
- 64
bert.encoder.layer.12.intermediate.dense.bias:
- 256
bert.encoder.layer.12.intermediate.dense.weight:
- 256
- 64
bert.encoder.layer.12.output.LayerNorm.bias:
- 64
bert.encoder.layer.12.output.LayerNorm.weight:
- 64
bert.encoder.layer.12.output.dense.bias:
- 64
bert.encoder.layer.12.output.dense.weight:
- 64
- 256
bert.encoder.layer.13.attention.output.dense.bias:
- 64
bert.encoder.layer.13.attention.output.dense.weight:
- 64
- 64
bert.encoder.layer.13.attention.output.norm.bias:
- 64
bert.encoder.layer.13.attention.output.norm.weight:
- 64
bert.encoder.layer.13.attention.self.key.bias:
- 64
bert.encoder.layer.13.attention.self.key.weight:
- 64
- 64
bert.encoder.layer.13.attention.self.query.bias:
- 64
bert.encoder.layer.13.attention.self.query.weight:
- 64
- 64
bert.encoder.layer.13.attention.self.value.bias:
- 64
bert.encoder.layer.13.attention.self.value.weight:
- 64
- 64
bert.encoder.layer.13.intermediate.dense.bias:
- 256
bert.encoder.layer.13.intermediate.dense.weight:
- 256
- 64
bert.encoder.layer.13.output.LayerNorm.bias:
- 64
bert.encoder.layer.13.output.LayerNorm.weight:
- 64
bert.encoder.layer.13.output.dense.bias:
- 64
bert.encoder.layer.13.output.dense.weight:
- 64
- 256
bert.encoder.layer.14.attention.output.dense.bias:
- 64
bert.encoder.layer.14.attention.output.dense.weight:
- 64
- 64
bert.encoder.layer.14.attention.output.norm.bias:
- 64
bert.encoder.layer.14.attention.output.norm.weight:
- 64
bert.encoder.layer.14.attention.self.key.bias:
- 64
bert.encoder.layer.14.attention.self.key.weight:
- 64
- 64
bert.encoder.layer.14.attention.self.query.bias:
- 64
bert.encoder.layer.14.attention.self.query.weight:
- 64
- 64
bert.encoder.layer.14.attention.self.value.bias:
- 64
bert.encoder.layer.14.attention.self.value.weight:
- 64
- 64
bert.encoder.layer.14.intermediate.dense.bias:
- 256
bert.encoder.layer.14.intermediate.dense.weight:
- 256
- 64
bert.encoder.layer.14.output.LayerNorm.bias:
- 64
bert.encoder.layer.14.output.LayerNorm.weight:
- 64
bert.encoder.layer.14.output.dense.bias:
- 64
bert.encoder.layer.14.output.dense.weight:
- 64
- 256
bert.encoder.layer.15.attention.output.dense.bias:
- 64
bert.encoder.layer.15.attention.output.dense.weight:
- 64
- 64
bert.encoder.layer.15.attention.output.norm.bias:
- 64
bert.encoder.layer.15.attention.output.norm.weight:
- 64
bert.encoder.layer.15.attention.self.key.bias:
- 64
bert.encoder.layer.15.attention.self.key.weight:
- 64
- 64
bert.encoder.layer.15.attention.self.query.bias:
- 64
bert.encoder.layer.15.attention.self.query.weight:
- 64
- 64
bert.encoder.layer.15.attention.self.value.bias:
- 64
bert.encoder.layer.15.attention.self.value.weight:
- 64
- 64
bert.encoder.layer.15.intermediate.dense.bias:
- 256
bert.encoder.layer.15.intermediate.dense.weight:
- 256
- 64
bert.encoder.layer.15.output.LayerNorm.bias:
- 64
bert.encoder.layer.15.output.LayerNorm.weight:
- 64
bert.encoder.layer.15.output.dense.bias:
- 64
bert.encoder.layer.15.output.dense.weight:
- 64
- 256
bert.encoder.layer.16.attention.output.dense.bias:
- 64
bert.encoder.layer.16.attention.output.dense.weight:
- 64
- 64
bert.encoder.layer.16.attention.output.norm.bias:
- 64
bert.encoder.layer.16.attention.output.norm.weight:
- 64
bert.encoder.layer.16.attention.self.key.bias:
- 64
bert.encoder.layer.16.attention.self.key.weight:
- 64
- 64
bert.encoder.layer.16.attention.self.query.bias:
- 64
bert.encoder.layer.16.attention.self.query.weight:
- 64
- 64
bert.encoder.layer.16.attention.self.value.bias:
- 64
bert.encoder.layer.16.attention.self.value.weight:
- 64
- 64
bert.encoder.layer.16.intermediate.dense.bias:
- 256
bert.encoder.layer.16.intermediate.dense.weight:
- 256
- 64
bert.encoder.layer.16.output.LayerNorm.bias:
- 64
bert.encoder.layer.16.output.LayerNorm.weight:
- 64
bert.encoder.layer.16.output.dense.bias:
- 64
bert.encoder.layer.16.output.dense.weight:
- 64
- 256
bert.encoder.layer.17.attention.output.dense.bias:
- 64
bert.encoder.layer.17.attention.output.dense.weight:
- 64
- 64
bert.encoder.layer.17.attention.output.norm.bias:
- 64
bert.encoder.layer.17.attention.output.norm.weight:
- 64
bert.encoder.layer.17.attention.self.key.bias:
- 64
bert.encoder.layer.17.attention.self.key.weight:
- 64
- 64
bert.encoder.layer.17.attention.self.query.bias:
- 64
bert.encoder.layer.17.attention.self.query.weight:
- 64
- 64
bert.encoder.layer.17.attention.self.value.bias:
- 64
bert.encoder.layer.17.attention.self.value.weight:
- 64
- 64
bert.encoder.layer.17.intermediate.dense.bias:
- 256
bert.encoder.layer.17.intermediate.dense.weight:
- 256
- 64
bert.encoder.layer.17.output.LayerNorm.bias:
- 64
bert.encoder.layer.17.output.LayerNorm.weight:
- 64
bert.encoder.layer.17.output.dense.bias:
- 64
bert.encoder.layer.17.output.dense.weight:
- 64
- 256
bert.encoder.layer.18.attention.output.dense.bias:
- 64
bert.encoder.layer.18.attention.output.dense.weight:
- 64
- 64
bert.encoder.layer.18.attention.output.norm.bias:
- 64
bert.encoder.layer.18.attention.output.norm.weight:
- 64
bert.encoder.layer.18.attention.self.key.bias:
- 64
bert.encoder.layer.18.attention.self.key.weight:
- 64
- 64
bert.encoder.layer.18.attention.self.query.bias:
- 64
bert.encoder.layer.18.attention.self.query.weight:
- 64
- 64
bert.encoder.layer.18.attention.self.value.bias:
- 64
bert.encoder.layer.18.attention.self.value.weight:
- 64
- 64
bert.encoder.layer.18.intermediate.dense.bias:
- 256
bert.encoder.layer.18.intermediate.dense.weight:
- 256
- 64
bert.encoder.layer.18.output.LayerNorm.bias:
- 64
bert.encoder.layer.18.output.LayerNorm.weight:
- 64
bert.encoder.layer.18.output.dense.bias:
- 64
bert.encoder.layer.18.output.dense.weight:
- 64
- 256
bert.encoder.layer.19.attention.output.dense.bias:
- 64
bert.encoder.layer.19.attention.output.dense.weight:
- 64
- 64
bert.encoder.layer.19.attention.output.norm.bias:
- 64
bert.encoder.layer.19.attention.output.norm.weight:
- 64
bert.encoder.layer.19.attention.self.key.bias:
- 64
bert.encoder.layer.19.attention.self.key.weight:
- 64
- 64
bert.encoder.layer.19.attention.self.query.bias:
- 64
bert.encoder.layer.19.attention.self.query.weight:
- 64
- 64
bert.encoder.layer.19.attention.self.value.bias:
- 64
bert.encoder.layer.19.attention.self.value.weight:
- 64
- 64
bert.encoder.layer.19.intermediate.dense.bias:
- 256
bert.encoder.layer.19.intermediate.dense.weight:
- 256
- 64
bert.encoder.layer.19.output.LayerNorm.bias:
- 64
bert.encoder.layer.19.output.LayerNorm.weight:
- 64
bert.encoder.layer.19.output.dense.bias:
- 64
bert.encoder.layer.19.output.dense.weight:
- 64
- 256
bert.encoder.layer.2.attention.output.dense.bias:
- 64
bert.encoder.layer.2.attention.output.dense.weight:
- 64
- 64
bert.encoder.layer.2.attention.output.norm.bias:
- 64
bert.encoder.layer.2.attention.output.norm.weight:
- 64
bert.encoder.layer.2.attention.self.key.bias:
- 64
bert.encoder.layer.2.attention.self.key.weight:
- 64
- 64
bert.encoder.layer.2.attention.self.query.bias:
- 64
bert.encoder.layer.2.attention.self.query.weight:
- 64
- 64
bert.encoder.layer.2.attention.self.value.bias:
- 64
bert.encoder.layer.2.attention.self.value.weight:
- 64
- 64
bert.encoder.layer.2.intermediate.dense.bias:
- 256
bert.encoder.layer.2.intermediate.dense.weight:
- 256
- 64
bert.encoder.layer.2.output.LayerNorm.bias:
- 64
bert.encoder.layer.2.output.LayerNorm.weight:
- 64
bert.encoder.layer.2.output.dense.bias:
- 64
bert.encoder.layer.2.output.dense.weight:
- 64
- 256
bert.encoder.layer.20.attention.output.dense.bias:
- 64
bert.encoder.layer.20.attention.output.dense.weight:
- 64
- 64
bert.encoder.layer.20.attention.output.norm.bias:
- 64
bert.encoder.layer.20.attention.output.norm.weight:
- 64
bert.encoder.layer.20.attention.self.key.bias:
- 64
bert.encoder.layer.20.attention.self.key.weight:
- 64
- 64
bert.encoder.layer.20.attention.self.query.bias:
- 64
bert.encoder.layer.20.attention.self.query.weight:
- 64
- 64
bert.encoder.layer.20.attention.self.value.bias:
- 64
bert.encoder.layer.20.attention.self.value.weight:
- 64
- 64
bert.encoder.layer.20.intermediate.dense.bias:
- 256
bert.encoder.layer.20.intermediate.dense.weight:
- 256
- 64
bert.encoder.layer.20.output.LayerNorm.bias:
- 64
bert.encoder.layer.20.output.LayerNorm.weight:
- 64
bert.encoder.layer.20.output.dense.bias:
- 64
bert.encoder.layer.20.output.dense.weight:
- 64
- 256
bert.encoder.layer.21.attention.output.dense.bias:
- 64
bert.encoder.layer.21.attention.output.dense.weight:
- 64
- 64
bert.encoder.layer.21.attention.output.norm.bias:
- 64
bert.encoder.layer.21.attention.output.norm.weight:
- 64
bert.encoder.layer.21.attention.self.key.bias:
- 64
bert.encoder.layer.21.attention.self.key.weight:
- 64
- 64
bert.encoder.layer.21.attention.self.query.bias:
- 64
bert.encoder.layer.21.attention.self.query.weight:
- 64
- 64
bert.encoder.layer.21.attention.self.value.bias:
- 64
bert.encoder.layer.21.attention.self.value.weight:
- 64
- 64
bert.encoder.layer.21.intermediate.dense.bias:
- 256
bert.encoder.layer.21.intermediate.dense.weight:
- 256
- 64
bert.encoder.layer.21.output.LayerNorm.bias:
- 64
bert.encoder.layer.21.output.LayerNorm.weight:
- 64
bert.encoder.layer.21.output.dense.bias:
- 64
bert.encoder.layer.21.output.dense.weight:
- 64
- 256
bert.encoder.layer.22.attention.output.dense.bias:
- 64
bert.encoder.layer.22.attention.output.dense.weight:
- 64
- 64
bert.encoder.layer.22.attention.output.norm.bias:
- 64
bert.encoder.layer.22.attention.output.norm.weight:
- 64
bert.encoder.layer.22.attention.self.key.bias:
- 64
bert.encoder.layer.22.attention.self.key.weight:
- 64
- 64
bert.encoder.layer.22.attention.self.query.bias:
- 64
bert.encoder.layer.22.attention.self.query.weight:
- 64
- 64
bert.encoder.layer.22.attention.self.value.bias:
- 64
bert.encoder.layer.22.attention.self.value.weight:
- 64
- 64
bert.encoder.layer.22.intermediate.dense.bias:
- 256
bert.encoder.layer.22.intermediate.dense.weight:
- 256
- 64
bert.encoder.layer.22.output.LayerNorm.bias:
- 64
bert.encoder.layer.22.output.LayerNorm.weight:
- 64
bert.encoder.layer.22.output.dense.bias:
- 64
bert.encoder.layer.22.output.dense.weight:
- 64
- 256
bert.encoder.layer.23.attention.output.dense.bias:
- 64
bert.encoder.layer.23.attention.output.dense.weight:
- 64
- 64
bert.encoder.layer.23.attention.output.norm.bias:
- 64
bert.encoder.layer.23.attention.output.norm.weight:
- 64
bert.encoder.layer.23.attention.self.key.bias:
- 64
bert.encoder.layer.23.attention.self.key.weight:
- 64
- 64
bert.encoder.layer.23.attention.self.query.bias:
- 64
bert.encoder.layer.23.attention.self.query.weight:
- 64
- 64
bert.encoder.layer.23.attention.self.value.bias:
- 64
bert.encoder.layer.23.attention.self.value.weight:
- 64
- 64
bert.encoder.layer.23.intermediate.dense.bias:
- 256
bert.encoder.layer.23.intermediate.dense.weight:
- 256
- 64
bert.encoder.layer.23.output.LayerNorm.bias:
- 64
bert.encoder.layer.23.output.LayerNorm.weight:
- 64
bert.encoder.layer.23.output.dense.bias:
- 64
bert.encoder.layer.23.output.dense.weight:
- 64
- 256
bert.encoder.layer.3.attention.output.dense.bias:
- 64
bert.encoder.layer.3.attention.output.dense.weight:
- 64
- 64
bert.encoder.layer.3.attention.output.norm.bias:
- 64
bert.encoder.layer.3.attention.output.norm.weight:
- 64
bert.encoder.layer.3.attention.self.key.bias:
- 64
bert.encoder.layer.3.attention.self.key.weight:
- 64
- 64
bert.encoder.layer.3.attention.self.query.bias:
- 64
bert.encoder.layer.3.attention.self.query.weight:
- 64
- 64
bert.encoder.layer.3.attention.self.value.bias:
- 64
bert.encoder.layer.3.attention.self.value.weight:
- 64
- 64
bert.encoder.layer.3.intermediate.dense.bias:
- 256
bert.encoder.layer.3.intermediate.dense.weight:
- 256
- 64
bert.encoder.layer.3.output.LayerNorm.bias:
- 64
bert.encoder.layer.3.output.LayerNorm.weight:
- 64
bert.encoder.layer.3.output.dense.bias:
- 64
bert.encoder.layer.3.output.dense.weight:
- 64
- 256
bert.encoder.layer.4.attention.output.dense.bias:
- 64
bert.encoder.layer.4.attention.output.dense.weight:
- 64
- 64
bert.encoder.layer.4.attention.output.norm.bias:
- 64
bert.encoder.layer.4.attention.output.norm.weight:
- 64
bert.encoder.layer.4.attention.self.key.bias:
- 64
bert.encoder.layer.4.attention.self.key.weight:
- 64
- 64
bert.encoder.layer.4.attention.self.query.bias:
- 64
bert.encoder.layer.4.attention.self.query.weight:
- 64
- 64
bert.encoder.layer.4.attention.self.value.bias:
- 64
bert.encoder.layer.4.attention.self.value.weight:
- 64
- 64
bert.encoder.layer.4.intermediate.dense.bias:
- 256
bert.encoder.layer.4.intermediate.dense.weight:
- 256
- 64
bert.encoder.layer.4.output.LayerNorm.bias:
- 64
bert.encoder.layer.4.output.LayerNorm.weight:
- 64
bert.encoder.layer.4.output.dense.bias:
- 64
bert.encoder.layer.4.output.dense.weight:
- 64
- 256
bert.encoder.layer.5.attention.output.dense.bias:
- 64
bert.encoder.layer.5.attention.output.dense.weight:
- 64
- 64
bert.encoder.layer.5.attention.output.norm.bias:
- 64
bert.encoder.layer.5.attention.output.norm.weight:
- 64
bert.encoder.layer.5.attention.self.key.bias:
- 64
bert.encoder.layer.5.attention.self.key.weight:
- 64
- 64
bert.encoder.layer.5.attention.self.query.bias:
- 64
bert.encoder.layer.5.attention.self.query.weight:
- 64
- 64
bert.encoder.layer.5.attention.self.value.bias:
- 64
bert.encoder.layer.5.attention.self.value.weight:
- 64
- 64
bert.encoder.layer.5.intermediate.dense.bias:
- 256
bert.encoder.layer.5.intermediate.dense.weight:
- 256
- 64
bert.encoder.layer.5.output.LayerNorm.bias:
- 64
bert.encoder.layer.5.output.LayerNorm.weight:
- 64
bert.encoder.layer.5.output.dense.bias:
- 64
bert.encoder.layer.5.output.dense.weight:
- 64
- 256
bert.encoder.layer.6.attention.output.dense.bias:
- 64
bert.encoder.layer.6.attention.output.dense.weight:
- 64
- 64
bert.encoder.layer.6.attention.output.norm.bias:
- 64
bert.encoder.layer.6.attention.output.norm.weight:
- 64
bert.encoder.layer.6.attention.self.key.bias:
- 64
bert.encoder.layer.6.attention.self.key.weight:
- 64
- 64
bert.encoder.layer.6.attention.self.query.bias:
- 64
bert.encoder.layer.6.attention.self.query.weight:
- 64
- 64
bert.encoder.layer.6.attention.self.value.bias:
- 64
bert.encoder.layer.6.attention.self.value.weight:
- 64
- 64
bert.encoder.layer.6.intermediate.dense.bias:
- 256
bert.encoder.layer.6.intermediate.dense.weight:
- 256
- 64
bert.encoder.layer.6.output.LayerNorm.bias:
- 64
bert.encoder.layer.6.output.LayerNorm.weight:
- 64
bert.encoder.layer.6.output.dense.bias:
- 64
bert.encoder.layer.6.output.dense.weight:
- 64
- 256
bert.encoder.layer.7.attention.output.dense.bias:
- 64
bert.encoder.layer.7.attention.output.dense.weight:
- 64
- 64
bert.encoder.layer.7.attention.output.norm.bias:
- 64
bert.encoder.layer.7.attention.output.norm.weight:
- 64
bert.encoder.layer.7.attention.self.key.bias:
- 64
bert.encoder.layer.7.attention.self.key.weight:
- 64
- 64
bert.encoder.layer.7.attention.self.query.bias:
- 64
bert.encoder.layer.7.attention.self.query.weight:
- 64
- 64
bert.encoder.layer.7.attention.self.value.bias:
- 64
bert.encoder.layer.7.attention.self.value.weight:
- 64
- 64
bert.encoder.layer.7.intermediate.dense.bias:
- 256
bert.encoder.layer.7.intermediate.dense.weight:
- 256
- 64
bert.encoder.layer.7.output.LayerNorm.bias:
- 64
bert.encoder.layer.7.output.LayerNorm.weight:
- 64
bert.encoder.layer.7.output.dense.bias:
- 64
bert.encoder.layer.7.output.dense.weight:
- 64
- 256
bert.encoder.layer.8.attention.output.dense.bias:
- 64
bert.encoder.layer.8.attention.output.dense.weight:
- 64
- 64
bert.encoder.layer.8.attention.output.norm.bias:
- 64
bert.encoder.layer.8.attention.output.norm.weight:
- 64
bert.encoder.layer.8.attention.self.key.bias:
- 64
bert.encoder.layer.8.attention.self.key.weight:
- 64
- 64
bert.encoder.layer.8.attention.self.query.bias:
- 64
bert.encoder.layer.8.attention.self.query.weight:
- 64
- 64
bert.encoder.layer.8.attention.self.value.bias:
- 64
bert.encoder.layer.8.attention.self.value.weight:
- 64
- 64
bert.encoder.layer.8.intermediate.dense.bias:
- 256
bert.encoder.layer.8.intermediate.dense.weight:
- 256
- 64
bert.encoder.layer.8.output.LayerNorm.bias:
- 64
bert.encoder.layer.8.output.LayerNorm.weight:
- 64
bert.encoder.layer.8.output.dense.bias:
- 64
bert.encoder.layer.8.output.dense.weight:
- 64
- 256
bert.encoder.layer.9.attention.output.dense.bias:
- 64
bert.encoder.layer.9.attention.output.dense.weight:
- 64
- 64
bert.encoder.layer.9.attention.output.norm.bias:
- 64
bert.encoder.layer.9.attention.output.norm.weight:
- 64
bert.encoder.layer.9.attention.self.key.bias:
- 64
bert.encoder.layer.9.attention.self.key.weight:
- 64
- 64
bert.encoder.layer.9.attention.self.query.bias:
- 64
bert.encoder.layer.9.attention.self.query.weight:
- 64
- 64
bert.encoder.layer.9.attention.self.value.bias:
- 64
bert.encoder.layer.9.attention.self.value.weight:
- 64
- 64
bert.encoder.layer.9.intermediate.dense.bias:
- 256
bert.encoder.layer.9.intermediate.dense.weight:
- 256
- 64
bert.encoder.layer.9.output.LayerNorm.bias:
- 64
bert.encoder.layer.9.output.LayerNorm.weight:
- 64
bert.encoder.layer.9.output.dense.bias:
- 64
bert.encoder.layer.9.output.dense.weight:
- 64
- 256
cls.predictions.bias:
- null
cls.predictions.transform.LayerNorm.bias:
- 64
cls.predictions.transform.LayerNorm.weight:
- 64
cls.predictions.transform.dense.bias:
- 64
cls.predictions.transform.dense.weight:
- 64
- 64
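The file's header comments describe a simple per-parameter encoding: each key is a parameter name mapped to a list with one entry per tensor dimension, where `null` marks a finite (non-width) dimension and a number gives the base size of a width dimension. As a minimal sketch (not a general YAML parser, and assuming only the restricted `key:` / `- value` subset this file uses), the format can be read with the standard library alone; the `parse_base_shapes` helper below is hypothetical, not part of any published API:

```python
def parse_base_shapes(text):
    """Parse the restricted YAML subset used by this base-shape file.

    Returns a dict mapping parameter names to per-dimension entries:
    None for a finite dimension, an int for the base width of an
    "infinite" dimension.
    """
    shapes = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and the header comments
        if line.endswith(":"):
            current = line[:-1]
            shapes[current] = []
        elif line.startswith("- "):
            value = line[2:]
            # `null` -> finite dimension; a number -> base width
            shapes[current].append(None if value == "null" else int(value))
    return shapes

# Two entries copied from the file above:
sample = """\
bert.embeddings.word_embeddings.weight:
- null
- 64
bert.encoder.layer.0.intermediate.dense.weight:
- 256
- 64
"""

shapes = parse_base_shapes(sample)
print(shapes["bert.embeddings.word_embeddings.weight"])  # [None, 64]
# The width ("infinite") dimensions are exactly the non-null entries:
widths = {k: [d for d in v if d is not None] for k, v in shapes.items()}
print(widths["bert.encoder.layer.0.intermediate.dense.weight"])  # [256, 64]
```

In practice a file like this is typically consumed by a muP-style library (e.g. passing the saved base-shapes path to the `mup` package's `set_base_shapes`) rather than parsed by hand; the sketch above only illustrates the encoding the comments describe.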