Update README.md
Browse files
README.md
CHANGED
@@ -9,7 +9,7 @@ tags:
|
|
9 |
---
|
10 |
|
11 |
## llama2.c-stories110M-pruned50
|
12 |
-
This repo contains model files for [llama2.c 110M tinystories](https://huggingface.co/Xenova/llama2.c-stories110M) optimized for [NM-vLLM](https://github.com/neuralmagic/nm-vllm), a high-throughput serving engine for compressed
|
13 |
|
14 |
This model was pruned with [SparseGPT](https://arxiv.org/abs/2301.00774), using [SparseML](https://github.com/neuralmagic/sparseml).
|
15 |
|
|
|
9 |
---
|
10 |
|
11 |
## llama2.c-stories110M-pruned50
|
12 |
+
This repo contains model files for [llama2.c 110M tinystories](https://huggingface.co/Xenova/llama2.c-stories110M) optimized for [NM-vLLM](https://github.com/neuralmagic/nm-vllm), a high-throughput serving engine for compressed LLMs.
|
13 |
|
14 |
This model was pruned with [SparseGPT](https://arxiv.org/abs/2301.00774), using [SparseML](https://github.com/neuralmagic/sparseml).
|
15 |
|