xinhe committed
Commit
72920eb
1 Parent(s): b162079

Update README.md

Files changed (1): README.md +6 -2
````diff
@@ -15,7 +15,9 @@ metrics:
 
 ### QuantizationAwareTraining
 
-This is an INT8 PyTorch model quantized by [intel/nlp-toolkit](https://github.com/intel/nlp-toolkit) using provider: [Intel® Neural Compressor](https://github.com/intel/neural-compressor). The original fp32 model comes from the fine-tuned model [Intel/bert-base-uncased-mrpc](https://huggingface.co/Intel/bert-base-uncased-mrpc)
+This is an INT8 PyTorch model quantized by [intel/nlp-toolkit](https://github.com/intel/nlp-toolkit) using provider: [Intel® Neural Compressor](https://github.com/intel/neural-compressor).
+
+The original fp32 model comes from the fine-tuned model [Intel/bert-base-uncased-mrpc](https://huggingface.co/Intel/bert-base-uncased-mrpc).
 
 #### Training hyperparameters
 
@@ -44,11 +46,13 @@ The following hyperparameters were used during training:
 | **Model size (MB)** |174|418|
 
 ### Load with nlp-toolkit:
+
 ```python
 from nlp_toolkit import OptimizedModel
 int8_model = OptimizedModel.from_pretrained(
-    'Intel/distilbert-base-uncased-finetuned-sst-2-english-int8-static',
+    'Intel/bert-base-uncased-mrpc-int8-qat',
 )
 ```
+
 Notes:
 - The INT8 model has better performance than the FP32 model when the CPU is fully occupied. Otherwise, there will be the illusion that INT8 is inferior to FP32.
````
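For context on what the INT8 quantization referenced in this diff does to model weights, here is a minimal, generic sketch of affine INT8 quantize/dequantize. This is an illustration only, not Intel® Neural Compressor's actual implementation; the function names `quantize_int8` and `dequantize_int8` are made up for this example.

```python
import numpy as np

def quantize_int8(x, scale, zero_point):
    # Affine quantization: q = round(x / scale) + zero_point, clipped to INT8 range.
    q = np.round(x / scale) + zero_point
    return np.clip(q, -128, 127).astype(np.int8)

def dequantize_int8(q, scale, zero_point):
    # Map INT8 values back to approximate float32 values.
    return (q.astype(np.float32) - zero_point) * scale

# Toy "weights"; a symmetric scale maps the largest magnitude to 127.
weights = np.array([-0.42, 0.0, 0.17, 0.91], dtype=np.float32)
scale = np.abs(weights).max() / 127
q = quantize_int8(weights, scale, zero_point=0)
restored = dequantize_int8(q, scale, zero_point=0)
```

Each weight is stored in one byte instead of four (hence the ~174 MB vs. 418 MB model sizes in the table above), at the cost of a rounding error bounded by roughly half the scale; quantization-aware training lets the model adapt to that error during fine-tuning.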