Commit d0444a9
1 Parent(s): a87fd16
Update README.md
README.md CHANGED
@@ -24,6 +24,9 @@ model-index:
 # Quantized-distilbert-banking77
 
 This model is a statically quantized version of [optimum/distilbert-base-uncased-finetuned-banking77](https://huggingface.co/optimum/distilbert-base-uncased-finetuned-banking77) on the `banking77` dataset.
+
+The model was created using the [optimum-static-quantization](https://github.com/philschmid/optimum-static-quantization) notebook.
+
 It achieves the following results on the evaluation set:
 
 **Accuracy**
@@ -40,8 +43,8 @@ Instance type: AWS c6i.xlarge
 
 | latency | vanilla transformers | quantized optimum model | improvement |
 |---------|----------------------|-------------------------|-------------|
-| p95 |
-| avg |
+| p95 | 75.69ms | 26.75ms | 2.83x |
+| avg | 57.52ms | 24.86ms | 2.31x |
 
 ## How to use
 
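The `improvement` column added in this commit looks like the vanilla latency divided by the quantized latency. The README does not state that formula, so treat it as an assumption. A minimal Python sketch that checks the arithmetic against the two added rows (the `speedup` helper and the `latencies_ms` dictionary are illustrative names, not part of the repository):

```python
# Latencies in milliseconds, copied from the rows added in the diff.
# Each entry is (vanilla transformers, quantized optimum model).
latencies_ms = {
    "p95": (75.69, 26.75),
    "avg": (57.52, 24.86),
}


def speedup(baseline_ms: float, optimized_ms: float) -> float:
    """Return the ratio baseline / optimized (assumed meaning of 'improvement')."""
    return baseline_ms / optimized_ms


for name, (vanilla, quantized) in latencies_ms.items():
    # Format to two decimals, matching the precision used in the table.
    print(f"{name}: {speedup(vanilla, quantized):.2f}x")
```

Under that assumption the script prints `p95: 2.83x` and `avg: 2.31x`, which matches the values in the table.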