
See the benchmark scripts in this repo.

Requirements:

```bash
pip install "deepsparse-nightly[llm]==1.6.0.20231120"
pip install openvino==2023.3.0
```

## Benchmarking

1. Clone this repo.
2. Concatenate the split fp32 IR model file:

   ```bash
   cd ./models/neuralmagic/mpt-7b-gsm8k-pt/fp32
   cat openvino_model.bin.part-a* > openvino_model.bin
   ```

3. Reproduce the Neural Magic paper results: `deepsparse_reproduce.bash`
4. Run the OpenVINO `benchmark_app` scripts: `benchmarkapp_*.bash`
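If `cat` is unavailable (e.g. on Windows), the concatenation in step 2 can be done in Python. A minimal sketch using only the standard library; the `concat_parts` helper and its arguments are illustrative, not part of this repo:

```python
from pathlib import Path


def concat_parts(directory: str, prefix: str, out_name: str) -> int:
    """Join split model files, equivalent to `cat <prefix>* > <out_name>`.

    Parts are joined in lexicographic order, which matches the
    `part-aa`, `part-ab`, ... naming produced by `split`.
    Returns the number of bytes written.
    """
    d = Path(directory)
    parts = sorted(d.glob(prefix + "*"))
    out_path = d / out_name
    with open(out_path, "wb") as out:
        for part in parts:
            out.write(part.read_bytes())
    return out_path.stat().st_size


# Example (paths as in step 2 above):
# concat_parts("./models/neuralmagic/mpt-7b-gsm8k-pt/fp32",
#              "openvino_model.bin.part-a", "openvino_model.bin")
```

The sorted glob is important: the part files must be joined in name order, or the resulting IR weights file will be corrupt.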

## Generating these IRs

https://github.com/yujiepan-work/24h1-sparse-quantized-llm-ov
