robertgshaw2
committed on
Commit aefa346 • 1 Parent(s): de47a92
Update README.md
README.md
CHANGED
@@ -9,12 +9,12 @@ tags:
 ---
 
 ## OpenHermes-2.5-Mistral-7B-pruned50
-This repo contains model files for [OpenHermes-2.5-Mistral-7B](https://huggingface.co/teknium/OpenHermes-2.5-Mistral-7B) optimized for [
+This repo contains model files for [OpenHermes-2.5-Mistral-7B](https://huggingface.co/teknium/OpenHermes-2.5-Mistral-7B) optimized for [nm-vllm](https://github.com/neuralmagic/nm-vllm), a high-throughput serving engine for compressed LLMs.
 
 This model was pruned with [SparseGPT](https://arxiv.org/abs/2301.00774), using [SparseML](https://github.com/neuralmagic/sparseml).
 
 ## Inference
-Install [
+Install [nm-vllm](https://github.com/neuralmagic/nm-vllm) for fast inference and low memory-usage:
 ```bash
 pip install nm-vllm[sparse]
 ```