### BERT-base compressed by JPQD with Regularization Factor 0.03
```
F1: 87.66
EM: 80.23
```
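The numbers above are the standard SQuAD exact-match (EM) and F1 scores. Purely as a point of reference, below is a minimal sketch of how such scores can be computed with the Hugging Face `evaluate` package; the prediction/reference records are illustrative and this is not the evaluation pipeline that produced the figures above.

```python
# Minimal sketch of SQuAD EM/F1 scoring with Hugging Face `evaluate`;
# illustrative only, not the script used to produce the numbers above.
import evaluate

squad_metric = evaluate.load("squad")

# One toy example: `id` ties a prediction to its reference answers.
predictions = [{"id": "0001", "prediction_text": "Denver Broncos"}]
references = [
    {"id": "0001", "answers": {"text": ["Denver Broncos"], "answer_start": [177]}}
]

scores = squad_metric.compute(predictions=predictions, references=references)
print(scores)  # e.g. {'exact_match': 100.0, 'f1': 100.0}
```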

### Description of important files
```
├── r0.030-squad-bert-b-mvmt-8bit
│   ├── 8bit_ref_bert_squad_nncf_mvmt.json (NNCF config used with the ssbs-feb branch)
│   ├── checkpoint-110000 (trained checkpoint used for generation)
│   ├── ir
│   │   ├── sparsity_structures.csv
│   │   ├── sparsity_structures.md  (layer-wise sparsity report, for the linear layers in the transformer blocks only)
│   │   ├── sparsity_structures.pkl (pruned structure ids, e.g. a particular head in MHSA or a dimension in the FFN; useful for debugging)
│   │   └── squad-BertForQuestionAnswering.cropped.8bit.xml (pruned dimensions discarded via a custom step and ONNX export, followed by IR translation)
│   ├── ir_uncropped
│   │   ├── mo-pruned-ir
│   │   │   ├── mo.log (see the Model Optimizer version here)
│   │   │   └── squad-BertForQuestionAnswering.8bit.xml (pruned structures removed using Model Optimizer --transform=Pruning)
│   │   └── squad-BertForQuestionAnswering.8bit.xml (pruned structures are only sparsified/zeroed)
```
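
For orientation, here is a minimal sketch (not the repo's own tooling) of how the artifacts above could be inspected: it loads the cropped 8-bit IR with the OpenVINO runtime and peeks at the sparsity reports. It assumes `openvino>=2022.1` and `pandas` are installed; file names follow the tree above, while the exact CSV schema and pickle contents are only as described there.

```python
# Sketch: load the cropped 8-bit IR and inspect the sparsity artifacts.
# Paths follow the directory tree above; everything else is illustrative.
import pickle

import pandas as pd
from openvino.runtime import Core

ROOT = "r0.030-squad-bert-b-mvmt-8bit"

# Cropped IR: pruned dimensions are physically removed from the graph.
core = Core()
model = core.read_model(f"{ROOT}/ir/squad-BertForQuestionAnswering.cropped.8bit.xml")
compiled = core.compile_model(model, device_name="CPU")
print("inputs :", [inp.get_any_name() for inp in compiled.inputs])
print("outputs:", [out.get_any_name() for out in compiled.outputs])

# Layer-wise sparsity report for the transformer-block linear layers.
report = pd.read_csv(f"{ROOT}/ir/sparsity_structures.csv")
print(report.head())

# Pruned structure ids (e.g. which MHSA heads / FFN dimensions were dropped).
with open(f"{ROOT}/ir/sparsity_structures.pkl", "rb") as f:
    pruned_structures = pickle.load(f)
print(type(pruned_structures))
```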