|     | layer_id                                                 | layer_type   | param_type   | shape      |   nparam |    nnz |   sparsity |
|----:|:---------------------------------------------------------|:-------------|:-------------|:-----------|---------:|-------:|-----------:|
|   5 | nncf_module.bert.encoder.layer.0.attention.self.query    | NNCFLinear   | weight       | [320, 768] |   245760 |  93583 |   0.61921  |
|   7 | nncf_module.bert.encoder.layer.0.attention.self.key      | NNCFLinear   | weight       | [320, 768] |   245760 |  98270 |   0.600138 |
|   9 | nncf_module.bert.encoder.layer.0.attention.self.value    | NNCFLinear   | weight       | [320, 768] |   245760 | 113605 |   0.53774  |
|  11 | nncf_module.bert.encoder.layer.0.attention.output.dense  | NNCFLinear   | weight       | [768, 320] |   245760 | 117208 |   0.523079 |
|  15 | nncf_module.bert.encoder.layer.0.intermediate.dense      | NNCFLinear   | weight       | [185, 768] |   142080 |  97073 |   0.316772 |
|  17 | nncf_module.bert.encoder.layer.0.output.dense            | NNCFLinear   | weight       | [768, 185] |   142080 |  94692 |   0.33353  |
|  21 | nncf_module.bert.encoder.layer.1.attention.self.query    | NNCFLinear   | weight       | [320, 768] |   245760 | 118436 |   0.518083 |
|  23 | nncf_module.bert.encoder.layer.1.attention.self.key      | NNCFLinear   | weight       | [320, 768] |   245760 | 118116 |   0.519385 |
|  25 | nncf_module.bert.encoder.layer.1.attention.self.value    | NNCFLinear   | weight       | [320, 768] |   245760 | 107511 |   0.562537 |
|  27 | nncf_module.bert.encoder.layer.1.attention.output.dense  | NNCFLinear   | weight       | [768, 320] |   245760 | 111189 |   0.547571 |
|  31 | nncf_module.bert.encoder.layer.1.intermediate.dense      | NNCFLinear   | weight       | [315, 768] |   241920 | 148783 |   0.384991 |
|  33 | nncf_module.bert.encoder.layer.1.output.dense            | NNCFLinear   | weight       | [768, 315] |   241920 | 143166 |   0.408209 |
|  37 | nncf_module.bert.encoder.layer.2.attention.self.query    | NNCFLinear   | weight       | [576, 768] |   442368 | 162735 |   0.632128 |
|  39 | nncf_module.bert.encoder.layer.2.attention.self.key      | NNCFLinear   | weight       | [576, 768] |   442368 | 164795 |   0.627471 |
|  41 | nncf_module.bert.encoder.layer.2.attention.self.value    | NNCFLinear   | weight       | [576, 768] |   442368 | 135670 |   0.69331  |
|  43 | nncf_module.bert.encoder.layer.2.attention.output.dense  | NNCFLinear   | weight       | [768, 576] |   442368 | 138445 |   0.687037 |
|  47 | nncf_module.bert.encoder.layer.2.intermediate.dense      | NNCFLinear   | weight       | [339, 768] |   260352 | 154035 |   0.408359 |
|  49 | nncf_module.bert.encoder.layer.2.output.dense            | NNCFLinear   | weight       | [768, 339] |   260352 | 150816 |   0.420723 |
|  53 | nncf_module.bert.encoder.layer.3.attention.self.query    | NNCFLinear   | weight       | [576, 768] |   442368 | 170623 |   0.614296 |
|  55 | nncf_module.bert.encoder.layer.3.attention.self.key      | NNCFLinear   | weight       | [576, 768] |   442368 | 178401 |   0.596714 |
|  57 | nncf_module.bert.encoder.layer.3.attention.self.value    | NNCFLinear   | weight       | [576, 768] |   442368 | 171905 |   0.611398 |
|  59 | nncf_module.bert.encoder.layer.3.attention.output.dense  | NNCFLinear   | weight       | [768, 576] |   442368 | 169172 |   0.617576 |
|  63 | nncf_module.bert.encoder.layer.3.intermediate.dense      | NNCFLinear   | weight       | [368, 768] |   282624 | 163163 |   0.422685 |
|  65 | nncf_module.bert.encoder.layer.3.output.dense            | NNCFLinear   | weight       | [768, 368] |   282624 | 157506 |   0.442701 |
|  69 | nncf_module.bert.encoder.layer.4.attention.self.query    | NNCFLinear   | weight       | [576, 768] |   442368 | 175772 |   0.602657 |
|  71 | nncf_module.bert.encoder.layer.4.attention.self.key      | NNCFLinear   | weight       | [576, 768] |   442368 | 177087 |   0.599684 |
|  73 | nncf_module.bert.encoder.layer.4.attention.self.value    | NNCFLinear   | weight       | [576, 768] |   442368 | 163996 |   0.629277 |
|  75 | nncf_module.bert.encoder.layer.4.attention.output.dense  | NNCFLinear   | weight       | [768, 576] |   442368 | 159335 |   0.639813 |
|  79 | nncf_module.bert.encoder.layer.4.intermediate.dense      | NNCFLinear   | weight       | [386, 768] |   296448 | 167726 |   0.434214 |
|  81 | nncf_module.bert.encoder.layer.4.output.dense            | NNCFLinear   | weight       | [768, 386] |   296448 | 159865 |   0.460732 |
|  85 | nncf_module.bert.encoder.layer.5.attention.self.query    | NNCFLinear   | weight       | [384, 768] |   294912 | 114186 |   0.612813 |
|  87 | nncf_module.bert.encoder.layer.5.attention.self.key      | NNCFLinear   | weight       | [384, 768] |   294912 | 132782 |   0.549757 |
|  89 | nncf_module.bert.encoder.layer.5.attention.self.value    | NNCFLinear   | weight       | [384, 768] |   294912 | 134830 |   0.542813 |
|  91 | nncf_module.bert.encoder.layer.5.attention.output.dense  | NNCFLinear   | weight       | [768, 384] |   294912 | 131941 |   0.552609 |
|  95 | nncf_module.bert.encoder.layer.5.intermediate.dense      | NNCFLinear   | weight       | [336, 768] |   258048 | 153916 |   0.403537 |
|  97 | nncf_module.bert.encoder.layer.5.output.dense            | NNCFLinear   | weight       | [768, 336] |   258048 | 145794 |   0.435012 |
| 101 | nncf_module.bert.encoder.layer.6.attention.self.query    | NNCFLinear   | weight       | [448, 768] |   344064 | 131878 |   0.616705 |
| 103 | nncf_module.bert.encoder.layer.6.attention.self.key      | NNCFLinear   | weight       | [448, 768] |   344064 | 144502 |   0.580014 |
| 105 | nncf_module.bert.encoder.layer.6.attention.self.value    | NNCFLinear   | weight       | [448, 768] |   344064 | 130911 |   0.619516 |
| 107 | nncf_module.bert.encoder.layer.6.attention.output.dense  | NNCFLinear   | weight       | [768, 448] |   344064 | 125928 |   0.633998 |
| 111 | nncf_module.bert.encoder.layer.6.intermediate.dense      | NNCFLinear   | weight       | [280, 768] |   215040 | 135283 |   0.370894 |
| 113 | nncf_module.bert.encoder.layer.6.output.dense            | NNCFLinear   | weight       | [768, 280] |   215040 | 131619 |   0.387932 |
| 117 | nncf_module.bert.encoder.layer.7.attention.self.query    | NNCFLinear   | weight       | [448, 768] |   344064 | 132120 |   0.616002 |
| 119 | nncf_module.bert.encoder.layer.7.attention.self.key      | NNCFLinear   | weight       | [448, 768] |   344064 | 152223 |   0.557574 |
| 121 | nncf_module.bert.encoder.layer.7.attention.self.value    | NNCFLinear   | weight       | [448, 768] |   344064 | 141066 |   0.590001 |
| 123 | nncf_module.bert.encoder.layer.7.attention.output.dense  | NNCFLinear   | weight       | [768, 448] |   344064 | 135662 |   0.605707 |
| 127 | nncf_module.bert.encoder.layer.7.intermediate.dense      | NNCFLinear   | weight       | [211, 768] |   162048 | 109590 |   0.323719 |
| 129 | nncf_module.bert.encoder.layer.7.output.dense            | NNCFLinear   | weight       | [768, 211] |   162048 | 107335 |   0.337635 |
| 133 | nncf_module.bert.encoder.layer.8.attention.self.query    | NNCFLinear   | weight       | [448, 768] |   344064 | 129148 |   0.62464  |
| 135 | nncf_module.bert.encoder.layer.8.attention.self.key      | NNCFLinear   | weight       | [448, 768] |   344064 | 130060 |   0.621989 |
| 137 | nncf_module.bert.encoder.layer.8.attention.self.value    | NNCFLinear   | weight       | [448, 768] |   344064 | 108162 |   0.685634 |
| 139 | nncf_module.bert.encoder.layer.8.attention.output.dense  | NNCFLinear   | weight       | [768, 448] |   344064 | 103447 |   0.699338 |
| 143 | nncf_module.bert.encoder.layer.8.intermediate.dense      | NNCFLinear   | weight       | [108, 768] |    82944 |  63275 |   0.237136 |
| 145 | nncf_module.bert.encoder.layer.8.output.dense            | NNCFLinear   | weight       | [768, 108] |    82944 |  62725 |   0.243767 |
| 149 | nncf_module.bert.encoder.layer.9.attention.self.query    | NNCFLinear   | weight       | [320, 768] |   245760 | 107145 |   0.564026 |
| 151 | nncf_module.bert.encoder.layer.9.attention.self.key      | NNCFLinear   | weight       | [320, 768] |   245760 | 101811 |   0.58573  |
| 153 | nncf_module.bert.encoder.layer.9.attention.self.value    | NNCFLinear   | weight       | [320, 768] |   245760 |  52182 |   0.787671 |
| 155 | nncf_module.bert.encoder.layer.9.attention.output.dense  | NNCFLinear   | weight       | [768, 320] |   245760 |  53210 |   0.783488 |
| 159 | nncf_module.bert.encoder.layer.9.intermediate.dense      | NNCFLinear   | weight       | [53, 768]  |    40704 |  33461 |   0.177943 |
| 161 | nncf_module.bert.encoder.layer.9.output.dense            | NNCFLinear   | weight       | [768, 53]  |    40704 |  32551 |   0.2003   |
| 165 | nncf_module.bert.encoder.layer.10.attention.self.query   | NNCFLinear   | weight       | [384, 768] |   294912 | 112430 |   0.618768 |
| 167 | nncf_module.bert.encoder.layer.10.attention.self.key     | NNCFLinear   | weight       | [384, 768] |   294912 | 109594 |   0.628384 |
| 169 | nncf_module.bert.encoder.layer.10.attention.self.value   | NNCFLinear   | weight       | [384, 768] |   294912 |  61774 |   0.790534 |
| 171 | nncf_module.bert.encoder.layer.10.attention.output.dense | NNCFLinear   | weight       | [768, 384] |   294912 |  64183 |   0.782366 |
| 175 | nncf_module.bert.encoder.layer.10.intermediate.dense     | NNCFLinear   | weight       | [86, 768]  |    66048 |  50455 |   0.236086 |
| 177 | nncf_module.bert.encoder.layer.10.output.dense           | NNCFLinear   | weight       | [768, 86]  |    66048 |  49741 |   0.246896 |
| 181 | nncf_module.bert.encoder.layer.11.attention.self.query   | NNCFLinear   | weight       | [384, 768] |   294912 |  88129 |   0.701168 |
| 183 | nncf_module.bert.encoder.layer.11.attention.self.key     | NNCFLinear   | weight       | [384, 768] |   294912 |  85288 |   0.710802 |
| 185 | nncf_module.bert.encoder.layer.11.attention.self.value   | NNCFLinear   | weight       | [384, 768] |   294912 |  47258 |   0.839756 |
| 187 | nncf_module.bert.encoder.layer.11.attention.output.dense | NNCFLinear   | weight       | [768, 384] |   294912 |  49311 |   0.832794 |
| 191 | nncf_module.bert.encoder.layer.11.intermediate.dense     | NNCFLinear   | weight       | [105, 768] |    80640 |  62254 |   0.228001 |
| 193 | nncf_module.bert.encoder.layer.11.output.dense           | NNCFLinear   | weight       | [768, 105] |    80640 |  61669 |   0.235255 |
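
The columns in the table above appear to be related in a simple way, which can be checked against any row. The following is a minimal sketch, assuming that `nparam` is the product of the entries in `shape`, that `nnz` counts the non-zero weights, and that `sparsity` is `1 - nnz / nparam`. These definitions are inferred from the numbers, not taken from the NNCF documentation, and `sparsity_stats` is a hypothetical helper name rather than part of any library.

```python
from math import prod


def sparsity_stats(shape, nnz):
    """Return (nparam, sparsity) for a weight tensor.

    Assumes nparam = product of the shape entries and
    sparsity = 1 - nnz / nparam (the fraction of zero-valued weights).
    """
    nparam = prod(shape)
    sparsity = 1.0 - nnz / nparam
    return nparam, sparsity


# Row 5: nncf_module.bert.encoder.layer.0.attention.self.query
nparam, sparsity = sparsity_stats([320, 768], 93583)
print(nparam, round(sparsity, 5))  # 245760 0.61921

# Row 159: nncf_module.bert.encoder.layer.9.intermediate.dense
nparam, sparsity = sparsity_stats([53, 768], 33461)
print(nparam, round(sparsity, 6))  # 40704 0.177943
```

Both results agree with the `nparam` and `sparsity` values listed for those rows, which supports the assumed definitions for this table.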