bert-base-uncased-sst2-unstructured80-int8-ov
- Model creator: Google
- Original model: google-bert/bert-base-uncased
Description
This model was produced by applying unstructured magnitude pruning, quantization, and distillation jointly to google-bert/bert-base-uncased while fine-tuning on the GLUE SST2 dataset. It achieves the following results on the evaluation set:
- Torch accuracy: 0.9128
- OpenVINO IR accuracy: 0.9128
- Sparsity in transformer block linear layers: 0.80
The model was converted to the OpenVINO™ IR (Intermediate Representation) format with weights compressed to INT8 by NNCF.
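To make the 0.80 sparsity figure concrete, here is a minimal NumPy sketch of unstructured magnitude pruning on a single weight matrix. It is an illustration only, not NNCF's implementation: the function name `magnitude_prune`, the random 768x768 matrix, and the one-shot thresholding are assumptions (the actual training raises sparsity gradually over several epochs).

```python
import numpy as np


def magnitude_prune(weights, sparsity):
    """Return a copy of `weights` with the smallest-magnitude entries set to zero.

    `sparsity` is the fraction of entries to zero out (0.8 means 80%).
    Hypothetical helper for illustration; not part of NNCF.
    """
    magnitudes = np.abs(weights).ravel()
    k = int(round(sparsity * magnitudes.size))  # number of weights to remove
    if k == 0:
        return weights.copy()
    # The k-th smallest magnitude acts as the pruning threshold.
    threshold = np.partition(magnitudes, k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask


# Stand-in for one linear layer of a transformer block (shape is an assumption).
rng = np.random.default_rng(0)
w = rng.normal(size=(768, 768))
pruned = magnitude_prune(w, 0.80)

achieved = 1.0 - np.count_nonzero(pruned) / pruned.size
print(f"achieved sparsity: {achieved:.4f}")
```

"Unstructured" here means individual weights are removed wherever they are smallest, with no constraint on rows, columns, or blocks.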
Compatibility
The provided OpenVINO™ IR model is compatible with:
- OpenVINO version 2024.3.0 and higher
- Optimum Intel 1.19.0 and higher
Optimization Parameters
Optimization was performed using nncf with the following nncf_config.json file:
[
    {
        "algorithm": "quantization",
        "preset": "mixed",
        "overflow_fix": "disable",
        "initializer": {
            "range": {
                "num_init_samples": 300,
                "type": "mean_min_max"
            },
            "batchnorm_adaptation": {
                "num_bn_adaptation_samples": 0
            }
        },
        "scope_overrides": {
            "activations": {
                "{re}.*matmul_0": {
                    "mode": "symmetric"
                }
            }
        },
        "ignored_scopes": [
            "{re}.*Embeddings.*",
            "{re}.*__add___[0-1]",
            "{re}.*layer_norm_0",
            "{re}.*matmul_1",
            "{re}.*__truediv__*"
        ]
    },
    {
        "algorithm": "magnitude_sparsity",
        "ignored_scopes": [
            "{re}.*NNCFEmbedding.*",
            "{re}.*LayerNorm.*",
            "{re}.*pooler.*",
            "{re}.*classifier.*"
        ],
        "sparsity_init": 0.0,
        "params": {
            "power": 3,
            "schedule": "polynomial",
            "sparsity_freeze_epoch": 10,
            "sparsity_target": 0.8,
            "sparsity_target_epoch": 9,
            "steps_per_epoch": 2105,
            "update_per_optimizer_step": true
        }
    }
]
For more information on optimization, check the OpenVINO model optimization guide.
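The `magnitude_sparsity` parameters in the config describe a polynomial ramp from `sparsity_init` to `sparsity_target`. The sketch below shows roughly how such a schedule behaves with the values from the config (power 3, target 0.8 at epoch 9, 2105 steps per epoch, updated per optimizer step). The concave form used here is an assumption about the schedule's shape, and `sparsity_at` is a hypothetical helper; NNCF's own scheduler may differ in detail.

```python
# Values copied from the nncf_config.json above.
SPARSITY_INIT = 0.0
SPARSITY_TARGET = 0.8
TARGET_EPOCH = 9
STEPS_PER_EPOCH = 2105
POWER = 3


def sparsity_at(step):
    """Sparsity level after `step` optimizer steps under a concave polynomial ramp.

    Assumed form: target - (target - init) * (1 - progress) ** power,
    where progress runs from 0 to 1 over the first TARGET_EPOCH epochs.
    """
    progress = min(1.0, step / (STEPS_PER_EPOCH * TARGET_EPOCH))
    return SPARSITY_TARGET - (SPARSITY_TARGET - SPARSITY_INIT) * (1.0 - progress) ** POWER


# Print the level at the start of each epoch up to the freeze epoch (10).
for epoch in range(11):
    print(f"epoch {epoch:2d}: sparsity {sparsity_at(epoch * STEPS_PER_EPOCH):.3f}")
```

Under this assumed form, sparsity rises quickly in the early epochs and flattens as it approaches the target, which leaves the later epochs for the model to recover accuracy at a nearly fixed sparsity level.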
Running Model Training
- Install required packages:
conda install pytorch torchvision torchaudio pytorch-cuda=11.6 -c pytorch -c nvidia
pip install "optimum[openvino,nncf]"
pip install datasets sentencepiece scipy scikit-learn protobuf evaluate
pip install wandb # optional
- Run model training:
NNCFCFG=/path/to/nncf_config.json
python run_glue.py \
    --lr_scheduler_type cosine_with_restarts \
    --cosine_lr_scheduler_cycles 11 6 \
    --record_best_model_after_epoch 9 \
    --load_best_model_at_end True \
    --metric_for_best_model accuracy \
    --model_name_or_path textattack/bert-base-uncased-SST-2 \
    --teacher_model_or_path yoshitomo-matsubara/bert-large-uncased-sst2 \
    --distillation_temperature 2 \
    --task_name sst2 \
    --nncf_compression_config $NNCFCFG \
    --distillation_weight 0.95 \
    --output_dir /tmp/bert-base-uncased-sst2-int8-unstructured80 \
    --overwrite_output_dir \
    --run_name bert-base-uncased-sst2-int8-unstructured80 \
    --do_train \
    --do_eval \
    --max_seq_length 128 \
    --per_device_train_batch_size 32 \
    --per_device_eval_batch_size 32 \
    --learning_rate 5e-05 \
    --optim adamw_torch \
    --num_train_epochs 17 \
    --logging_steps 1 \
    --evaluation_strategy steps \
    --eval_steps 250 \
    --save_strategy steps \
    --save_steps 250 \
    --save_total_limit 1 \
    --fp16 \
    --seed 1
For more details, refer to the training configuration and script.
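The `--distillation_weight 0.95` and `--distillation_temperature 2` flags in the command above control how the teacher's predictions are mixed into the training loss. A minimal sketch of one common formulation follows, written in plain Python for a single example. The exact loss used by the training script is not shown in this card, so the formula (temperature-scaled KL divergence multiplied by T squared, blended with the task loss) and the function names are assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(t * (math.log(t) - math.log(s)) for t, s in zip(p_teacher, p_student))
    return kl * temperature ** 2


def combined_loss(task_loss, student_logits, teacher_logits, weight=0.95, temperature=2.0):
    """Blend the task loss with the distillation loss (assumed weighting scheme)."""
    kd = distillation_loss(student_logits, teacher_logits, temperature)
    return (1.0 - weight) * task_loss + weight * kd


# Hypothetical logits for one SST2 sentence (negative, positive).
student = [0.3, 1.1]
teacher = [-1.5, 2.4]
print(combined_loss(task_loss=0.45, student_logits=student, teacher_logits=teacher))
```

With a weight of 0.95, the student is trained mostly to match the teacher's softened outputs, and the SST2 labels contribute only 5% of the loss under this formulation.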
Usage examples
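No example survives under this heading, so the following is a minimal sketch rather than the card's original snippet. It assumes the model is published under the repository id `OpenVINO/bert-base-uncased-sst2-unstructured80-int8-ov`, that the repository also contains the tokenizer files, and that Optimum Intel is installed with the OpenVINO extra (`pip install "optimum[openvino]"`). The helper name `build_classifier` is hypothetical.

```python
# Assumed repository id; adjust if the model is hosted elsewhere.
MODEL_ID = "OpenVINO/bert-base-uncased-sst2-unstructured80-int8-ov"


def build_classifier(model_id=MODEL_ID):
    """Load the OpenVINO IR model and wrap it in a text-classification pipeline."""
    # Imported inside the function so the heavy dependencies load only when needed.
    from optimum.intel import OVModelForSequenceClassification
    from transformers import AutoTokenizer, pipeline

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = OVModelForSequenceClassification.from_pretrained(model_id)
    return pipeline("text-classification", model=model, tokenizer=tokenizer)
```

Calling `build_classifier()("He's a dreadful magician.")` should return a list with one dictionary holding a `label` and a `score`. The first call downloads the model files, so it needs network access.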
Limitations
Check the original model card for limitations.
Legal information
The original model is distributed under the Apache 2.0 license. More details can be found in the google-bert/bert-base-uncased model card.
Disclaimer
Intel is committed to respecting human rights and avoiding causing or contributing to adverse impacts on human rights. See Intel’s Global Human Rights Principles. Intel’s products and software are intended only to be used in applications that do not cause or contribute to adverse impacts on human rights.