xinhe committed on
Commit ad58783
1 Parent(s): 78c21db

Update README.md

  1. README.md +18 -8
README.md CHANGED
@@ -8,23 +8,33 @@ tags:
  datasets:
  - sst2
  metrics:
- - f1
+ - accuracy
  ---
 
- INT8 DistilBERT base uncased finetuned SST-2 (Post-training static quantization)
- ===
- This is an INT8 PyTorch model quantized by [intel/nlp-toolkit](https://github.com/intel/nlp-toolkit) using provider: [Intel® Neural Compressor](https://github.com/intel/neural-compressor). The original fp32 model comes from the fine-tuned model [distilbert-base-uncased-finetuned-sst-2-english](https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english)
+ # INT8 DistilBERT base uncased finetuned SST-2
 
- Test result below comes from [Amazon Web Services](https://aws.amazon.com/) c6i.xlarge (intel ice lake: 4 vCPUs, 8g Memory) instance.
+ ### Post-training static quantization
 
- | |int8|fp32|
+ This is an INT8 PyTorch model quantized with [intel/nlp-toolkit](https://github.com/intel/nlp-toolkit) using provider: [Intel® Neural Compressor](https://github.com/intel/neural-compressor).
+
+ The original fp32 model comes from the fine-tuned model [distilbert-base-uncased-finetuned-sst-2-english](https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english).
+
+ The calibration dataloader is the train dataloader. The default calibration sampling size of 100 is not evenly divisible by the batch size of 8, so
+ the actual sampling size is 104.
+
+ ### Test result
+
+ - Batch size = 8
+ - [Amazon Web Services](https://aws.amazon.com/) c6i.xlarge (Intel Ice Lake: 4 vCPUs, 8 GB memory) instance.
+
+ | |INT8|FP32|
  |---|:---:|:---:|
  | **Throughput (samples/sec)** |47.554|23.046|
- | **Accuracy(f1-score)** |0.9037|0.9106|
+ | **Accuracy (eval-accuracy)** |0.9037|0.9106|
  | **Model size (MB)** |66|255|
 
 
- Load with nlp-toolkit:
+ ### Load with nlp-toolkit:
  ```python
  from nlp_toolkit import OptimizedModel
  int8_model = OptimizedModel.from_pretrained(