lewtun/quantized-distilbert-banking77


lewtun HF staff committed on Jun 8, 2022

Commit 17a612f • 1 Parent(s): cf17f41

Create README.md (#1)

- Create README.md (4c400e643d6c6c2e551ebd7ada51bfd68de105f4)

Files changed (1)

README.md +60 -0

README.md ADDED

	@@ -0,0 +1,60 @@

+---
+tags:
+- optimum
+datasets:
+- banking77
+metrics:
+- accuracy
+model-index:
+- name: quantized-distilbert-banking77
+  results:
+  - task:
+      name: Text Classification
+      type: text-classification
+    dataset:
+      name: banking77
+      type: banking77
+    metrics:
+    - name: Accuracy
+      type: accuracy
+      value: 0.9244
+---
+# Quantized-distilbert-banking77
+This model is a dynamically quantized version of [optimum/distilbert-base-uncased-finetuned-banking77](https://huggingface.co/optimum/distilbert-base-uncased-finetuned-banking77), a DistilBERT model fine-tuned on the `banking77` dataset.
+The model was created using the [dynamic-quantization](https://github.com/huggingface/workshops/tree/main/mlops-world) notebook from a workshop presented at MLOps World 2022.
+It achieves the following results on the evaluation set:
+**Accuracy**
+- Vanilla model: 92.5%
+- Quantized model: 92.44%
+> The quantized model retains 99.72% of the fp32 model's accuracy
+**Latency**
+- Payload sequence length: 128
+- Instance type: AWS c6i.xlarge
+| latency | vanilla transformers | quantized optimum model | improvement |
+|---------|----------------------|-------------------------|-------------|
+| p95     | 63.24ms              | 37.06ms                 | 1.71x       |
+| avg     | 62.87ms              | 37.93ms                 | 1.66x       |
+## How to use
+```python
+from optimum.onnxruntime import ORTModelForSequenceClassification
+from transformers import AutoTokenizer, pipeline
+
+model_id = "lewtun/quantized-distilbert-banking77"
+# Load the quantized ONNX model and its tokenizer from the Hub
+model = ORTModelForSequenceClassification.from_pretrained(model_id)
+tokenizer = AutoTokenizer.from_pretrained(model_id)
+# Wrap both in a standard transformers pipeline and classify a query
+classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)
+classifier("What is the exchange rate like on this app?")
+```
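
The README reports p95 and average latency but does not say how they were measured. The following is a minimal sketch of one way such figures could be collected and compared. It is not the workshop's benchmark code: `measure_latency` and `speedup` are hypothetical helper names, and the warmup and run counts are assumptions. The two `speedup` calls at the end only recompute the "improvement" column from the numbers in the latency table.

```python
import statistics
import time
from typing import Callable, Dict


def measure_latency(fn: Callable[[], object], warmup: int = 10, runs: int = 100) -> Dict[str, float]:
    """Time repeated calls to `fn`; return average and p95 latency in milliseconds."""
    # Warm-up calls are not timed, so one-off initialisation cost is excluded.
    for _ in range(warmup):
        fn()

    timings_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings_ms.append((time.perf_counter() - start) * 1000)

    # quantiles(n=100) returns 99 cut points; index 94 is the 95th percentile.
    p95 = statistics.quantiles(timings_ms, n=100)[94]
    return {"avg_ms": statistics.mean(timings_ms), "p95_ms": p95}


def speedup(baseline_ms: float, optimized_ms: float) -> float:
    """Ratio of baseline latency to optimized latency."""
    return baseline_ms / optimized_ms


# Recompute the "improvement" column from the reported latencies.
print(f"p95: {speedup(63.24, 37.06):.2f}x")
print(f"avg: {speedup(62.87, 37.93):.2f}x")
```

With the `classifier` pipeline from the README loaded, a call such as `measure_latency(lambda: classifier("What is the exchange rate like on this app?"))` would time end-to-end inference. Results depend on hardware and input length, so they will not necessarily match the table.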