---
language:
- en
license: apache-2.0
tags:
- token-classification
- int8
- Intel® Neural Compressor
- PostTrainingStatic
datasets:
- conll2003
metrics:
- accuracy
model-index:
- name: distilbert-base-uncased-finetuned-conll03-english-int8-static
  results:
  - task:
      name: Token Classification
      type: token-classification
    dataset:
      name: Conll2003
      type: conll2003
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.9858650364082395
---
# INT8 distilbert-base-uncased-finetuned-conll03-english

### Post-training static quantization

This is an INT8 PyTorch model quantized with [huggingface/optimum-intel](https://github.com/huggingface/optimum-intel) using [Intel® Neural Compressor](https://github.com/intel/neural-compressor).

The original fp32 model comes from the fine-tuned model [elastic/distilbert-base-uncased-finetuned-conll03-english](https://huggingface.co/elastic/distilbert-base-uncased-finetuned-conll03-english).

The calibration dataloader is the training dataloader. The default calibration sampling size of 100 is not a multiple of the batch size of 8, so calibration consumes 13 full batches and the actual sampling size is 104.
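
The rounding described above can be reproduced with a few lines of arithmetic. This is a minimal sketch, assuming calibration reads whole batches until at least the requested number of samples has been seen; the helper name `effective_calibration_samples` is hypothetical and is not part of optimum-intel or Intel® Neural Compressor.

```python
import math


def effective_calibration_samples(sampling_size: int, batch_size: int) -> int:
    """Round the requested sample count up to a whole number of batches.

    Hypothetical helper for illustration only; it mirrors the arithmetic
    stated in this model card, not a library API.
    """
    num_batches = math.ceil(sampling_size / batch_size)
    return num_batches * batch_size


# Requested 100 samples with batch size 8: 13 batches are needed.
print(effective_calibration_samples(100, 8))
```

Under this assumption, choosing a sampling size that is already a multiple of the batch size (for example 96 or 104) makes the requested and actual sizes match.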

### Test result

|   |INT8|FP32|
|---|:---:|:---:|
| **Accuracy (eval-accuracy)** |0.9859|0.9882|
| **Model size (MB)**  |64.5|253|
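
To put the table in perspective, the trade-off between size and accuracy can be computed directly from the reported figures. The snippet below is only an illustrative calculation on the numbers in the table; the variable names are chosen for this example.

```python
# Figures copied from the table above.
int8 = {"accuracy": 0.9859, "size_mb": 64.5}
fp32 = {"accuracy": 0.9882, "size_mb": 253.0}

# How many times smaller the INT8 model is than the FP32 model.
size_ratio = fp32["size_mb"] / int8["size_mb"]

# Accuracy lost to quantization, in absolute and relative terms.
accuracy_drop = fp32["accuracy"] - int8["accuracy"]
relative_drop = accuracy_drop / fp32["accuracy"]

print(f"Size reduction: {size_ratio:.2f}x")
print(f"Absolute accuracy drop: {accuracy_drop:.4f}")
print(f"Relative accuracy drop: {relative_drop:.2%}")
```

On these figures the INT8 model is roughly a quarter of the FP32 size, with a relative accuracy drop well under 1%.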

### Load with Optimum

```python
from optimum.intel import INCModelForTokenClassification

model_id = "Intel/distilbert-base-uncased-finetuned-conll03-english-int8-static"
int8_model = INCModelForTokenClassification.from_pretrained(model_id)
```