revise readme
README.md
CHANGED
@@ -8,7 +8,7 @@ tags:
 library_name: "OpenVINO"
 ---
 
-The intent of this repo is to compare the performance delta between dense quantized MPT-7B and 70% sparse-quantized MPT-7B on OpenVINO. Quantization here is 8-bit on both weight and activation.
+The intent of this repo is to compare the performance delta between dense quantized MPT-7B and 70% sparse-quantized MPT-7B on OpenVINO. Quantization here is 8-bit on both weight and activation. Benchmark metric is decoding (next token) latency with context length 512.
 
 Target HW: Intel 4th gen Xeon (Sapphire Rapids)
 