Commit 4f2c88b, committed by yuwenz
1 Parent(s): 59314e2

upload int8 onnx model

Signed-off-by: yuwenzho <yuwen.zhou@intel.com>

Files changed (2)
  1. README.md +23 -0
  2. model.onnx +3 -0
README.md CHANGED
@@ -9,6 +9,7 @@ metrics:
 tags:
 - text-classfication
 - int8
+- onnx
 ---
 
 # Dynamically quantized DistilBERT base uncased finetuned SST-2
@@ -26,6 +27,8 @@ tags:
 
 ## How to Get Started With the Model
 
+### PyTorch
+
 To load the quantized model, you can do as follows:
 
 ```python
@@ -33,3 +36,23 @@ from optimum.intel.neural_compressor.quantization import IncQuantizedModelForSeq
 
 model = IncQuantizedModelForSequenceClassification.from_pretrained("Intel/distilbert-base-uncased-finetuned-sst-2-english-int8-dynamic")
 ```
+
+### ONNX
+
+This is an INT8 ONNX model quantized with [Intel® Neural Compressor](https://github.com/intel/neural-compressor).
+
+The original fp32 model comes from the fine-tuned model [DistilBERT](https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english).
+
+#### Test result
+
+| |INT8|FP32|
+|---|:---:|:---:|
+| **Accuracy (eval-f1)** |0.9037|0.9106|
+| **Model size (MB)** |73|256|
+
+#### Load ONNX model:
+
+```python
+from optimum.onnxruntime import ORTModelForSequenceClassification
+model = ORTModelForSequenceClassification.from_pretrained('Intel/distilbert-base-uncased-finetuned-sst-2-english-int8-dynamic')
+```
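
For readers weighing the trade-off reported in the "Test result" table, the sketch below works out the size reduction and the eval-f1 change from the figures in the diff. The dictionary and variable names are illustrative only; the four numbers are copied from the table and nothing is measured here.

```python
# Figures copied from the "Test result" table added to README.md.
int8 = {"f1": 0.9037, "size_mb": 73}
fp32 = {"f1": 0.9106, "size_mb": 256}

# How many times smaller the INT8 model is than the FP32 model.
size_ratio = fp32["size_mb"] / int8["size_mb"]

# Absolute and relative loss in eval-f1 after quantization.
f1_drop = fp32["f1"] - int8["f1"]
relative_drop = f1_drop / fp32["f1"]

print(f"Size reduction: {size_ratio:.2f}x")            # Size reduction: 3.51x
print(f"Absolute eval-f1 drop: {f1_drop:.4f}")         # Absolute eval-f1 drop: 0.0069
print(f"Relative eval-f1 drop: {relative_drop:.2%}")   # Relative eval-f1 drop: 0.76%
```

On the reported numbers, the INT8 model is roughly 3.5 times smaller at a cost of under one percent relative eval-f1.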
model.onnx ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ffaa5bd531a044237ee88f08e97dcae85bb121806fd1a5e7c556a13927343ad4
+size 76104966
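
The three lines added for `model.onnx` are a Git LFS pointer, not the model weights themselves. As a minimal sketch, the pointer can be parsed with the standard library to confirm that its `size` field agrees with the 73 MB figure in the README table. The helper `parse_lfs_pointer` is a hypothetical name introduced here for illustration, not part of any library.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into a dict of its key/value fields.

    Hypothetical helper: each pointer line is "<key> <value>".
    """
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


# Pointer contents copied from the model.onnx diff above.
pointer_text = """\
version https://git-lfs.github.com/spec/v1
oid sha256:ffaa5bd531a044237ee88f08e97dcae85bb121806fd1a5e7c556a13927343ad4
size 76104966
"""

pointer = parse_lfs_pointer(pointer_text)

# The oid is "<hash algorithm>:<hex digest>".
algo, _, digest = pointer["oid"].partition(":")

# The size field is the byte count of the real file stored in LFS.
size_bytes = int(pointer["size"])
size_mib = size_bytes / 2**20

print(algo, len(digest))        # sha256 64
print(f"{size_mib:.1f} MiB")    # 72.6 MiB
```

The 76,104,966 bytes come to about 72.6 MiB, which rounds to the 73 reported in the table, so the table's "MB" appears to mean binary megabytes.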