### BERT-base compressed by JPQD with Regularization Factor 0.03

```
F1: 87.66
EM: 80.23
```

### Description of important files

```
├── r0.030-squad-bert-b-mvmt-8bit
│   ├── 8bit_ref_bert_squad_nncf_mvmt.json (NNCF config used with the ssbs-feb branch)
│   ├── checkpoint-110000 (trained checkpoint for generation)
│   ├── ir
│   │   ├── sparsity_structures.csv
│   │   ├── sparsity_structures.md (layer-wise sparsity report, for linear layers in transformer blocks only)
│   │   ├── sparsity_structures.pkl (contains pruned structure IDs, e.g. a particular head in MHSA or dimension in FFN; useful for debugging)
│   │   └── squad-BertForQuestionAnswering.cropped.8bit.xml (custom discard of pruned dimensions and ONNX export, followed by IR translation)
│   ├── ir_uncropped
│   │   ├── mo-pruned-ir
│   │   │   ├── mo.log (see Model Optimizer version here)
│   │   │   └── squad-BertForQuestionAnswering.8bit.xml (pruned structures are removed using Model Optimizer --transform=Pruning)
│   │   └── squad-BertForQuestionAnswering.8bit.xml (pruned structures are sparsified/zeroed only)
```
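
To compare the three `.xml` variants listed above (cropped, Model Optimizer pruned, and zeroed only), a quick check is to count the layer types in each file. The following is a minimal sketch, not part of the original release: `count_layer_types` is a hypothetical helper, and it assumes the files follow the usual OpenVINO IR layout in which each graph node is a `<layer>` element carrying a `type` attribute. It uses only the Python standard library, so it does not need OpenVINO installed.

```python
import sys
import xml.etree.ElementTree as ET
from collections import Counter


def count_layer_types(ir_xml_path):
    """Count layers per type in an IR .xml file (hypothetical helper).

    Assumption: graph nodes are stored as <layer type="..."> elements.
    """
    root = ET.parse(ir_xml_path).getroot()
    # iter() walks the whole document, so layers nested inside
    # sub-graph bodies are counted as well.
    return Counter(layer.get("type") for layer in root.iter("layer"))


if __name__ == "__main__":
    # Pass one or more IR .xml paths on the command line.
    for path in sys.argv[1:]:
        print(path)
        for layer_type, n in count_layer_types(path).most_common():
            print(f"  {layer_type}: {n}")
```

Saved under a name of your choice, the script takes the paths from the tree as arguments, for example `ir/squad-BertForQuestionAnswering.cropped.8bit.xml` and `ir_uncropped/squad-BertForQuestionAnswering.8bit.xml`. Layer counts only show graph structure; weight shapes live in the accompanying `.bin` data and are better inspected through the OpenVINO runtime itself.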