File size: 1,911 Bytes
296f7bb 78c21db 8e70cc9 78c21db 296f7bb 78c21db ad58783 296f7bb ad58783 296f7bb ac67851 296f7bb b71abb6 ad58783 a96c305 ad58783 ac67851 ad58783 296f7bb ad58783 d0e482a 296f7bb ac67851 a96c305 296f7bb b71abb6 fcf0731 296f7bb ac67851 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 |
---
language: en
license: apache-2.0
tags:
- text-classfication
- int8
- Intel® Neural Compressor
- PostTrainingStatic
datasets:
- sst2
metrics:
- accuracy
---
# INT8 DistilBERT base uncased finetuned SST-2
## Post-training static quantization
### PyTorch
This is an INT8 PyTorch model quantized with [huggingface/optimum-intel](https://github.com/huggingface/optimum-intel) through the usage of [Intel® Neural Compressor](https://github.com/intel/neural-compressor).
The original fp32 model comes from the fine-tuned model [distilbert-base-uncased-finetuned-sst-2-english](https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english).
The calibration dataloader is the train dataloader. The default calibration sampling size 100 isn't divisible exactly by batch size 8, so
the real sampling size is 104.
#### Test result
| |INT8|FP32|
|---|:---:|:---:|
| **Accuracy (eval-accuracy)** |0.9037|0.9106|
| **Model size (MB)** |65|255|
#### Load with optimum:
```python
from optimum.intel.neural_compressor.quantization import IncQuantizedModelForSequenceClassification
int8_model = IncQuantizedModelForSequenceClassification.from_pretrained(
'Intel/distilbert-base-uncased-finetuned-sst-2-english-int8-static',
)
```
### ONNX
This is an INT8 ONNX model quantized with [Intel® Neural Compressor](https://github.com/intel/neural-compressor).
The original fp32 model comes from the fine-tuned model [distilbert-base-uncased-finetuned-sst-2-english](https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english).
#### Test result
| |INT8|FP32|
|---|:---:|:---:|
| **Accuracy (eval-f1)** |0.9060|0.9106|
| **Model size (MB)** |80|256|
#### Load ONNX model:
```python
from optimum.onnxruntime import ORTModelForSequenceClassification
model = ORTModelForSequenceClassification.from_pretrained('Intel/distilbert-base-uncased-finetuned-sst-2-english-int8-static')
``` |