---
pipeline_tag: text-generation
tags:
- openvino
- mpt
- sparse
- quantization
library_name: "OpenVINO"
---

See the benchmark scripts in this repo. Install the dependencies:

```bash
pip install deepsparse-nightly[llm]==1.6.0.20231120
pip install openvino==2023.3.0
```

## Benchmarking

1. Clone this repo.
2. Concatenate the parts of the big fp32 IR model:
```bash
cd ./models/neuralmagic/mpt-7b-gsm8k-pt/fp32
cat openvino_model.bin.part-a* > openvino_model.bin
```
3. Reproduce the Neural Magic paper results: `deepsparse_reproduce.bash`
4. OpenVINO benchmark_app: `benchmarkapp_*.bash`

## Generating these IRs

https://github.com/yujiepan-work/24h1-sparse-quantized-llm-ov
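For environments without a shell (e.g. Windows or a Python-only download script), the concatenation step above can be done in Python. This is a minimal sketch; the helper name `concat_ir_parts` is illustrative, not part of this repo:

```python
from pathlib import Path


def concat_ir_parts(model_dir: str, out_name: str = "openvino_model.bin") -> Path:
    """Join split IR weight files into a single .bin, equivalent to
    `cat openvino_model.bin.part-a* > openvino_model.bin`."""
    model_path = Path(model_dir)
    # glob order is not guaranteed, so sort to match the shell's `part-a*` expansion
    parts = sorted(model_path.glob(out_name + ".part-a*"))
    if not parts:
        raise FileNotFoundError(f"no {out_name}.part-a* files found in {model_path}")
    out_path = model_path / out_name
    with out_path.open("wb") as out:
        for part in parts:
            out.write(part.read_bytes())
    return out_path
```

After this, `openvino_model.bin` sits next to `openvino_model.xml` and the IR can be loaded as usual.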