---
pipeline_tag: text-generation
tags:
- openvino
- mpt
- sparse
- quantization
library_name: "OpenVINO"
---

This repo contains OpenVINO IRs of `neuralmagic/mpt-7b-gsm8k-pt` together with the benchmark scripts used to measure them.

## Setup
Install the dependencies:

```bash
pip install deepsparse-nightly[llm]==1.6.0.20231120
pip install openvino==2023.3.0
```
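
Optionally, a quick sanity check that both packages import (a minimal sketch; the printed version strings depend on your environment):

```bash
# Sanity check: both packages should import and report their versions.
python -c "import deepsparse; print('deepsparse', deepsparse.__version__)"
python -c "import openvino.runtime as ov; print('openvino', ov.get_version())"
```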

## Benchmarking
1. Clone this repo
2. Reassemble the fp32 IR weights (the large `.bin` file is split into parts):
```bash
cd ./models/neuralmagic/mpt-7b-gsm8k-pt/fp32
cat openvino_model.bin.part-a* > openvino_model.bin
```
3. Reproduce the Neural Magic paper results with DeepSparse: `deepsparse_reproduce.bash`
4. Run the OpenVINO `benchmark_app` scripts: `benchmarkapp_*.bash` (a typical invocation is sketched below)
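
For reference, a minimal sketch of what such a `benchmark_app` run looks like on the concatenated fp32 IR. The `openvino_model.xml` path is an assumption based on the standard IR naming next to the `.bin`; the actual options used are in the `benchmarkapp_*.bash` scripts. If `benchmark_app` is not on your PATH from the `openvino` wheel, it is also available via `pip install openvino-dev`.

```bash
# Hypothetical invocation (the real commands live in benchmarkapp_*.bash):
# benchmark the concatenated fp32 IR on CPU in latency mode.
benchmark_app \
    -m ./models/neuralmagic/mpt-7b-gsm8k-pt/fp32/openvino_model.xml \
    -d CPU \
    -hint latency
```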

## Generating these IRs
https://github.com/yujiepan-work/24h1-sparse-quantized-llm-ov