Update README.md (#1)

6b46ffb verified 5 days ago

5.78 kB

	---
	license: apache-2.0
	language:
	- en
	tags:
	- generated_from_trainer
	datasets:
	- glue
	metrics:
	- accuracy
	model-index:
	- name: yujiepan/bert-base-uncased-sst2-int8-unstructured80
	results:
	- task:
	name: Text Classification
	type: text-classification
	dataset:
	name: GLUE SST2
	type: glue
	config: sst2
	split: validation
	args: sst2
	metrics:
	- name: Accuracy
	type: accuracy
	value: 0.91284
	base_model:
	- google-bert/bert-base-uncased
	base_model_relation: quantized
	---

	# bert-base-uncased-sst2-unstructured80-int8-ov

	* Model creator: [Google](https://huggingface.co/google-bert)
	* Original model: [google-bert/bert-base-uncased](https://huggingface.co/google-bert/bert-base-uncased)

	## Description

	This model conducts unstructured magnitude pruning, quantization and distillation at the same time on [google-bert/bert-base-uncased](https://huggingface.co/google-bert/bert-base-uncased) when finetuning on the GLUE SST2 dataset.
	It achieves the following results on the evaluation set:
	- Torch accuracy: 0.9128
	- OpenVINO IR accuracy: 0.9128
	- Sparsity in transformer block linear layers: 0.80

	The model was converted to the [OpenVINO™ IR](https://docs.openvino.ai/2024/documentation/openvino-ir-format.html) (Intermediate Representation) format with weights compressed to INT8 by [NNCF](https://github.com/openvinotoolkit/nncf).

	## Compatibility

	The provided OpenVINO™ IR model is compatible with:

	* OpenVINO version 2024.3.0 and higher
	* Optimum Intel 1.19.0 and higher

	## Optimization Parameters

	Optimization was performed using `nncf` with the following `nncf_config.json` file:

	```
	[
	{
	"algorithm": "quantization",
	"preset": "mixed",
	"overflow_fix": "disable",
	"initializer": {
	"range": {
	"num_init_samples": 300,
	"type": "mean_min_max"
	},
	"batchnorm_adaptation": {
	"num_bn_adaptation_samples": 0
	}
	},
	"scope_overrides": {
	"activations": {
	"{re}.*matmul_0": {
	"mode": "symmetric"
	}
	}
	},
	"ignored_scopes": [
	"{re}.Embeddings.",
	"{re}.*__add___[0-1]",
	"{re}.*layer_norm_0",
	"{re}.*matmul_1",
	"{re}.__truediv__"
	]
	},
	{
	"algorithm": "magnitude_sparsity",
	"ignored_scopes": [
	"{re}.NNCFEmbedding.",
	"{re}.LayerNorm.",
	"{re}.pooler.",
	"{re}.classifier."
	],
	"sparsity_init": 0.0,
	"params": {
	"power": 3,
	"schedule": "polynomial",
	"sparsity_freeze_epoch": 10,
	"sparsity_target": 0.8,
	"sparsity_target_epoch": 9,
	"steps_per_epoch": 2105,
	"update_per_optimizer_step": true
	}
	}
	]
	```

	For more information on optimization, check the [OpenVINO model optimization guide](https://docs.openvino.ai/2024/openvino-workflow/model-optimization.html).

	## Running Model Training

	1. Install required packages:

	```
	conda install pytorch torchvision torchaudio pytorch-cuda=11.6 -c pytorch -c nvidia
	pip install optimum[openvino,nncf]
	pip install datasets sentencepiece scipy scikit-learn protobuf evaluate
	pip install wandb # optional
	```

	2. Run model training:

	```
	NNCFCFG=/path/to/nncf_config.json
	python run_glue.py \
	--lr_scheduler_type cosine_with_restarts \
	--cosine_lr_scheduler_cycles 11 6 \
	--record_best_model_after_epoch 9 \
	--load_best_model_at_end True \
	--metric_for_best_model accuracy \
	--model_name_or_path textattack/bert-base-uncased-SST-2 \
	--teacher_model_or_path yoshitomo-matsubara/bert-large-uncased-sst2 \
	--distillation_temperature 2 \
	--task_name sst2 \
	--nncf_compression_config $NNCFCFG \
	--distillation_weight 0.95 \
	--output_dir /tmp/bert-base-uncased-sst2-int8-unstructured80 \
	--overwrite_output_dir \
	--run_name bert-base-uncased-sst2-int8-unstructured80 \
	--do_train \
	--do_eval \
	--max_seq_length 128 \
	--per_device_train_batch_size 32 \
	--per_device_eval_batch_size 32 \
	--learning_rate 5e-05 \
	--optim adamw_torch \
	--num_train_epochs 17 \
	--logging_steps 1 \
	--evaluation_strategy steps \
	--eval_steps 250 \
	--save_strategy steps \
	--save_steps 250 \
	--save_total_limit 1 \
	--fp16 \
	--seed 1
	```

	For more details, refer to the [training configuration and script](https://gist.github.com/yujiepan-work/5d7e513a47b353db89f6e1b512d7c080).

	## Usage examples

	* [OpenVINO notebooks](https://github.com/openvinotoolkit/openvino_notebooks):
	- [Accelerate Inference of Sparse Transformer Models with OpenVINO™ and 4th Gen Intel® Xeon® Scalable Processors](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/sparsity-optimization/sparsity-optimization.ipynb)

	## Limitations

	Check the original model card for [limitations](https://huggingface.co/google-bert/bert-base-uncased).

	## Legal information

	The original model is distributed under [apache-2.0](https://choosealicense.com/licenses/apache-2.0/) license. More details can be found in [google-bert/bert-base-uncased](https://huggingface.co/google-bert/bert-base-uncased) model card.

	## Disclaimer

	Intel is committed to respecting human rights and avoiding causing or contributing to adverse impacts on human rights. See [Intel’s Global Human Rights Principles](https://www.intel.com/content/dam/www/central-libraries/us/en/documents/policy-human-rights.pdf). Intel’s products and software are intended only to be used in applications that do not cause or contribute to adverse impacts on human rights.