xinhe committed
Commit
fcf0731
1 Parent(s): 885f6cd

Update README.md

Files changed (1)
  1. README.md +9 -9
README.md CHANGED
@@ -1,7 +1,7 @@
  ---
  language: en
  license: apache-2.0
- tags: text-classification
+ tags: text-classification, int8, PostTrainingStatic
  datasets:
  - sst2
  ---
@@ -10,21 +10,21 @@ INT8 DistilBERT base uncased finetuned SST-2 (Post-training static quantization)
  ===
  This is an INT8 PyTorch model quantized by [intel/nlp-toolkit](https://github.com/intel/nlp-toolkit) using the provider [Intel® Neural Compressor](https://github.com/intel/neural-compressor). The original FP32 model comes from the fine-tuned model [distilbert-base-uncased-finetuned-sst-2-english](https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english).

- The test results below come from an [AWS](https://aws.amazon.com/) c6i.xlarge instance (Intel Ice Lake: 4 vCPUs, 8 GB memory).
+ The test results below come from an [Amazon Web Services](https://aws.amazon.com/) c6i.xlarge instance (Intel Ice Lake: 4 vCPUs, 8 GB memory).

- | |fp32|int8|
+ | |int8|fp32|
  |---|:---:|:---:|
- | **Accuracy** |0.9106|0.9037|
- | **Throughput (samples/sec)** |?|?|
- | **Model size (MB)** |255|66|
+ | **Throughput (samples/sec)** |47.554|23.046|
+ | **Accuracy (F1 score)** |0.9037|0.9106|
+ | **Model size (MB)** |66|255|


- Load with optimum:
+ Load with nlp-toolkit:
  ```python
  from nlp_toolkit import OptimizedModel
  int8_model = OptimizedModel.from_pretrained(
- 'intel/distilbert-base-uncased-finetuned-sst-2-english-int8-static',
+ 'Intel/distilbert-base-uncased-finetuned-sst-2-english-int8-static',
  )
  ```
  Notes:
- - The INT8 model outperforms the FP32 model only when the CPU is fully loaded; otherwise, INT8 can appear inferior to FP32.
+ - The INT8 model outperforms the FP32 model only when the CPU is fully occupied; otherwise, INT8 can appear inferior to FP32.
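
For anyone picking the model up from this commit, here is a minimal end-to-end sketch expanding the loading snippet above. It is an illustration, not part of the commit: it assumes the object returned by `OptimizedModel.from_pretrained` accepts `transformers`-style keyword inputs and returns an output with `.logits`, and that the tokenizer of the original FP32 checkpoint is the right one to pair with the INT8 model.

```python
import torch
from transformers import AutoTokenizer
from nlp_toolkit import OptimizedModel

# Tokenizer is taken from the original FP32 checkpoint (assumed compatible
# with the quantized model, which ships no tokenizer of its own).
tokenizer = AutoTokenizer.from_pretrained(
    'distilbert-base-uncased-finetuned-sst-2-english'
)
int8_model = OptimizedModel.from_pretrained(
    'Intel/distilbert-base-uncased-finetuned-sst-2-english-int8-static',
)

inputs = tokenizer('A charming and thought-provoking film.', return_tensors='pt')
with torch.no_grad():
    outputs = int8_model(**inputs)  # assumed transformers-style call signature

# SST-2 labels: 0 = negative, 1 = positive.
prediction = outputs.logits.argmax(dim=-1).item()
print('positive' if prediction == 1 else 'negative')
```

Per the note in the diff, throughput comparisons against the FP32 model are only meaningful when the CPU is saturated, e.g. under a sustained batch workload rather than a single one-off request.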