Chua, Vui Seng committed
Commit: fae2108
Parent(s): 5b8a717
Update readme
README.md CHANGED

@@ -1,6 +1,6 @@
 This model is a downstream optimization of [```vuiseng9/bert-base-squadv1-block-pruning-hybrid-filled-lt```](https://huggingface.co/vuiseng9/bert-base-squadv1-block-pruning-hybrid-filled-lt) using [OpenVINO/NNCF](https://github.com/openvinotoolkit/nncf). Applied optimization includes:
-1. magnitude sparsification
-2. NNCF Quantize-Aware Training
+1. Magnitude sparsification at 50% upon initialization. Parameters are ranked globally by their absolute magnitude. Only the linear layers of self-attention and FFNN are targeted.
+2. NNCF Quantization-Aware Training - symmetric 8-bit for both weights and activations on all learnable layers.
 3. Custom distillation with large model ```bert-large-uncased-whole-word-masking-finetuned-squad```
 
 ```
@@ -84,7 +84,6 @@ python run_qa.py \
 ```
 
 # Eval
-
 This repo must be cloned locally.
 ```bash
 git clone https://huggingface.co/vuiseng9/bert-base-squadv1-block-pruning-hybrid-filled-lt-nncf-50.0sparse-qat-lt
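For readers trying to map items 1 and 2 of the updated list onto an NNCF setup, the settings correspond to the `compression` section of an NNCF config. The exact config used for this checkpoint is not part of this commit, so the following is only a sketch: the input shapes, the sparsity schedule fields, and the scope restriction are assumptions, while the 50% sparsity level and the symmetric 8-bit weight/activation quantization come from the README text.

```python
# Sketch only: the real NNCF config for this checkpoint is not shown in this commit.
# Input shapes, schedule fields, and scope handling below are assumptions.
from nncf import NNCFConfig
from nncf.torch import create_compressed_model
from transformers import AutoModelForQuestionAnswering

# Student: the block-pruned, filled BERT-base checkpoint this model starts from.
model = AutoModelForQuestionAnswering.from_pretrained(
    "vuiseng9/bert-base-squadv1-block-pruning-hybrid-filled-lt"
)

nncf_config = NNCFConfig.from_dict({
    # Dummy input description NNCF needs to trace the model (SQuAD max length 384 assumed).
    "input_info": [
        {"sample_size": [1, 384], "type": "long"},  # input_ids
        {"sample_size": [1, 384], "type": "long"},  # attention_mask
        {"sample_size": [1, 384], "type": "long"},  # token_type_ids
    ],
    "compression": [
        {
            # 1. Magnitude sparsification: 50% of targeted weights zeroed from the start,
            #    ranked globally by absolute value. Restricting it to the self-attention
            #    and FFNN linear layers would be done via "ignored_scopes"/"target_scopes";
            #    the exact scope names are not known here, so they are omitted.
            "algorithm": "magnitude_sparsity",
            "sparsity_init": 0.5,
            "params": {"schedule": "multistep", "multistep_sparsity_levels": [0.5]},
        },
        {
            # 2. QAT: symmetric 8-bit quantizers for both weights and activations.
            "algorithm": "quantization",
            "weights": {"mode": "symmetric", "bits": 8},
            "activations": {"mode": "symmetric", "bits": 8},
        },
    ],
})

# Wraps the student with sparsity masks and fake-quantization ops; fine-tuning
# (QAT plus distillation) then proceeds with the usual run_qa.py training loop.
compression_ctrl, model = create_compressed_model(model, nncf_config)
```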
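Item 3, the custom distillation against ```bert-large-uncased-whole-word-masking-finetuned-squad```, is likewise not spelled out in this commit. Below is a minimal sketch of a span-logit distillation term for SQuAD; the temperature, the averaging over start/end logits, and the weighting against the regular QA cross-entropy loss are assumptions.

```python
import torch.nn.functional as F

def qa_distillation_loss(student_start, student_end, teacher_start, teacher_end, T=2.0):
    # KL divergence between temperature-softened teacher and student span logits.
    # Each tensor is [batch, seq_len]: start/end logits from the student (BERT-base)
    # and the teacher (bert-large-uncased-whole-word-masking-finetuned-squad).
    def kd(student_logits, teacher_logits):
        return F.kl_div(
            F.log_softmax(student_logits / T, dim=-1),
            F.softmax(teacher_logits / T, dim=-1),
            reduction="batchmean",
        ) * (T * T)

    return 0.5 * (kd(student_start, teacher_start) + kd(student_end, teacher_end))

# Typical use in a training step (the alpha weighting is an assumption):
# loss = alpha * qa_distillation_loss(s_start, s_end, t_start, t_end) + (1 - alpha) * ce_loss
```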