---
license: apache-2.0
language:
- en
tags:
- generated_from_trainer
datasets:
- glue
metrics:
- accuracy
model-index:
- name: yujiepan/bert-base-uncased-sst2-int8-unstructured80
  results:
  - task:
      name: Text Classification
      type: text-classification
    dataset:
      name: GLUE SST2
      type: glue
      config: sst2
      split: validation
      args: sst2
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.91284
base_model:
- google-bert/bert-base-uncased
base_model_relation: quantized
---
# bert-base-uncased-sst2-unstructured80-int8-ov
* Model creator: [Google](https://huggingface.co/google-bert)
* Original model: [google-bert/bert-base-uncased](https://huggingface.co/google-bert/bert-base-uncased)
## Description
This model was produced by applying unstructured magnitude pruning, quantization, and distillation jointly to [google-bert/bert-base-uncased](https://huggingface.co/google-bert/bert-base-uncased) while fine-tuning on the GLUE SST2 dataset.
It achieves the following results on the evaluation set:
- Torch accuracy: **0.9128**
- OpenVINO IR accuracy: **0.9128**
- Sparsity in transformer block linear layers: **0.80**
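To make the sparsity figure concrete, here is a minimal sketch of unstructured magnitude pruning on a flat list of weights. It is not NNCF's implementation: NNCF applies masks per layer and raises sparsity gradually during training, whereas this one-shot version only shows the selection rule. The names `magnitude_prune` and `sparsity_of` and the toy weights are hypothetical.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction `sparsity` of `weights`.

    Illustrative one-shot version; names and data are hypothetical.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    num_to_prune = int(len(weights) * sparsity)
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:num_to_prune])
    return [0.0 if i in pruned else w for i, w in enumerate(weights)]


def sparsity_of(weights):
    """Fraction of entries that are exactly zero."""
    return sum(1 for w in weights if w == 0.0) / len(weights)


# Toy weights for illustration only, not taken from the model.
toy_weights = [0.9, -0.05, 0.3, -0.7, 0.01, 0.2, -0.15, 0.6, -0.02, 0.1]
sparse_weights = magnitude_prune(toy_weights, 0.8)
print(sparse_weights)
print(sparsity_of(sparse_weights))
```

Because individual weights are removed regardless of position, the resulting pattern is "unstructured": no whole rows, columns, or heads are dropped.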
The model was converted to the [OpenVINO™ IR](https://docs.openvino.ai/2024/documentation/openvino-ir-format.html) (Intermediate Representation) format with weights compressed to INT8 by [NNCF](https://github.com/openvinotoolkit/nncf).
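As a rough illustration of what INT8 weight compression means, the sketch below quantizes a list of floats with a single symmetric scale and reconstructs them. This is an assumption-laden simplification: NNCF's quantizers are tuned during training and may use different granularity or ranges, and the function names and toy values here are hypothetical.

```python
def quantize_symmetric_int8(values):
    """Map floats to integer codes in [-127, 127] using one shared scale."""
    max_abs = max(abs(v) for v in values)
    # Fall back to a scale of 1.0 for an all-zero tensor to avoid dividing by zero.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Reconstruct approximate floats from integer codes."""
    return [c * scale for c in codes]


# Toy values for illustration only, not taken from the model.
toy_values = [0.5, -1.27, 0.0, 0.64]
codes, scale = quantize_symmetric_int8(toy_values)
restored = dequantize(codes, scale)
print(codes, scale)
print(restored)
```

With a symmetric scale, a weight of exactly 0.0 maps to code 0, so pruned weights stay zero after quantization.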
## Compatibility
The provided OpenVINO™ IR model is compatible with:
* OpenVINO version 2024.3.0 and higher
* Optimum Intel 1.19.0 and higher
## Optimization Parameters
Optimization was performed using `nncf` with the following `nncf_config.json` file:
```
[
    {
        "algorithm": "quantization",
        "preset": "mixed",
        "overflow_fix": "disable",
        "initializer": {
            "range": {
                "num_init_samples": 300,
                "type": "mean_min_max"
            },
            "batchnorm_adaptation": {
                "num_bn_adaptation_samples": 0
            }
        },
        "scope_overrides": {
            "activations": {
                "{re}.*matmul_0": {
                    "mode": "symmetric"
                }
            }
        },
        "ignored_scopes": [
            "{re}.*Embeddings.*",
            "{re}.*__add___[0-1]",
            "{re}.*layer_norm_0",
            "{re}.*matmul_1",
            "{re}.*__truediv__*"
        ]
    },
    {
        "algorithm": "magnitude_sparsity",
        "ignored_scopes": [
            "{re}.*NNCFEmbedding.*",
            "{re}.*LayerNorm.*",
            "{re}.*pooler.*",
            "{re}.*classifier.*"
        ],
        "sparsity_init": 0.0,
        "params": {
            "power": 3,
            "schedule": "polynomial",
            "sparsity_freeze_epoch": 10,
            "sparsity_target": 0.8,
            "sparsity_target_epoch": 9,
            "steps_per_epoch": 2105,
            "update_per_optimizer_step": true
        }
    }
]
```
For more information on optimization, check the [OpenVINO model optimization guide](https://docs.openvino.ai/2024/openvino-workflow/model-optimization.html).
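The `magnitude_sparsity` parameters above describe how sparsity ramps from `sparsity_init` to `sparsity_target`. The sketch below evaluates one plausible reading of that schedule, assuming the concave polynomial form `target - (target - init) * (1 - progress) ** power` with progress measured in fractional epochs (as `update_per_optimizer_step` suggests). NNCF's scheduler may differ in details, and `sparsity_freeze_epoch` is not modeled here. The helper `optimizer_steps_per_epoch` is hypothetical; it shows that 2105 is consistent with the 67,349-example SST-2 training split at a batch size of 32 on a single device.

```python
import math


def polynomial_sparsity(epoch, step=0, *, init=0.0, target=0.8,
                        target_epoch=9, power=3, steps_per_epoch=2105):
    """Sparsity level at (epoch, step) under an assumed concave polynomial ramp."""
    progress = (epoch + step / steps_per_epoch) / target_epoch
    if progress >= 1.0:
        # Past the target epoch the level is held at the target.
        return target
    return target - (target - init) * (1.0 - progress) ** power


def optimizer_steps_per_epoch(num_examples, batch_size):
    """Number of optimizer steps per epoch without gradient accumulation."""
    return math.ceil(num_examples / batch_size)


for epoch in range(0, 11):
    print(epoch, round(polynomial_sparsity(epoch), 4))

# Assumes the SST-2 training split (67,349 examples) and one device.
print(optimizer_steps_per_epoch(67349, 32))
```

Under this form, sparsity rises quickly in the early epochs and flattens as it approaches the target, which leaves later epochs for recovering accuracy at near-final sparsity.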
## Running Model Training
1. Install required packages:
```
conda install pytorch torchvision torchaudio pytorch-cuda=11.6 -c pytorch -c nvidia
pip install "optimum[openvino,nncf]"
pip install datasets sentencepiece scipy scikit-learn protobuf evaluate
pip install wandb # optional
```
2. Run model training:
```
NNCFCFG=/path/to/nncf_config.json
python run_glue.py \
--lr_scheduler_type cosine_with_restarts \
--cosine_lr_scheduler_cycles 11 6 \
--record_best_model_after_epoch 9 \
--load_best_model_at_end True \
--metric_for_best_model accuracy \
--model_name_or_path textattack/bert-base-uncased-SST-2 \
--teacher_model_or_path yoshitomo-matsubara/bert-large-uncased-sst2 \
--distillation_temperature 2 \
--task_name sst2 \
--nncf_compression_config $NNCFCFG \
--distillation_weight 0.95 \
--output_dir /tmp/bert-base-uncased-sst2-int8-unstructured80 \
--overwrite_output_dir \
--run_name bert-base-uncased-sst2-int8-unstructured80 \
--do_train \
--do_eval \
--max_seq_length 128 \
--per_device_train_batch_size 32 \
--per_device_eval_batch_size 32 \
--learning_rate 5e-05 \
--optim adamw_torch \
--num_train_epochs 17 \
--logging_steps 1 \
--evaluation_strategy steps \
--eval_steps 250 \
--save_strategy steps \
--save_steps 250 \
--save_total_limit 1 \
--fp16 \
--seed 1
```
For more details, refer to the [training configuration and script](https://gist.github.com/yujiepan-work/5d7e513a47b353db89f6e1b512d7c080).
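The `--distillation_temperature` and `--distillation_weight` flags control how the student is trained against the teacher. The sketch below shows a common formulation of such a loss, a weighted sum of a temperature-scaled KL term and a hard-label cross-entropy term. It is an assumption for illustration: the linked script may combine the terms differently, and the function names here are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax of `logits / temperature`."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, weight=0.95):
    """weight * T^2 * KL(teacher || student) + (1 - weight) * cross-entropy."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # Soft-target term, scaled by T^2 as in the usual formulation.
    kd = sum(t * math.log(t / s) for t, s in zip(p_teacher, p_student) if t > 0)
    kd *= temperature ** 2
    # Hard-label term against the ground-truth class index.
    ce = -math.log(softmax(student_logits)[label])
    return weight * kd + (1.0 - weight) * ce


# Toy logits for a two-class (negative/positive) problem, illustration only.
print(distillation_loss([1.0, -0.5], [3.0, -2.0], label=0))
```

With a weight of 0.95, the teacher's soft targets dominate the objective and the ground-truth labels contribute only 5% of the loss in this formulation.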
## Usage examples
* [OpenVINO notebooks](https://github.com/openvinotoolkit/openvino_notebooks):
- [Accelerate Inference of Sparse Transformer Models with OpenVINO™ and 4th Gen Intel® Xeon® Scalable Processors](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/sparsity-optimization/sparsity-optimization.ipynb)
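For quick local inference, a minimal sketch using Optimum Intel might look like the following. The repository id in `MODEL_ID` is an assumption based on this card's title and should be replaced with the actual id; the helper names `load_classifier` and `classify` are hypothetical. `OVModelForSequenceClassification`, `AutoTokenizer`, and `pipeline` are existing Optimum Intel and Transformers APIs.

```python
# Assumed repository id, inferred from the card title; adjust as needed.
MODEL_ID = "OpenVINO/bert-base-uncased-sst2-unstructured80-int8-ov"


def load_classifier(model_id=MODEL_ID):
    """Build a text-classification pipeline backed by the OpenVINO IR model."""
    # Imported lazily so this module loads even without optimum-intel installed.
    from optimum.intel import OVModelForSequenceClassification
    from transformers import AutoTokenizer, pipeline

    model = OVModelForSequenceClassification.from_pretrained(model_id)
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    return pipeline("text-classification", model=model, tokenizer=tokenizer)


def classify(texts, classifier=None):
    """Classify a list of sentences; loads the default model if none is given."""
    if classifier is None:
        classifier = load_classifier()
    return classifier(texts)
```

Calling `classify(["A gripping, beautifully shot film."])` downloads the model on first use and returns one dictionary per input with `label` and `score` keys; the label names depend on the model's configuration.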
## Limitations
Check the original model card for [limitations](https://huggingface.co/google-bert/bert-base-uncased).
## Legal information
The original model is distributed under the [apache-2.0](https://choosealicense.com/licenses/apache-2.0/) license. More details can be found in the [google-bert/bert-base-uncased](https://huggingface.co/google-bert/bert-base-uncased) model card.
## Disclaimer
Intel is committed to respecting human rights and avoiding causing or contributing to adverse impacts on human rights. See [Intel’s Global Human Rights Principles](https://www.intel.com/content/dam/www/central-libraries/us/en/documents/policy-human-rights.pdf). Intel’s products and software are intended only to be used in applications that do not cause or contribute to adverse impacts on human rights.