robertgshaw2 committed
Commit 1865eb1
1 Parent(s): 8a52c2d

Update README.md

Files changed (1):
  1. README.md +1 -1
README.md CHANGED
@@ -9,7 +9,7 @@ tags:
  - int4
  ---
 
- ## zephyr-7b-beta-marlin
+ ## openhermes-2.5-mistral-7b
  This repo contains model files for [OpenHermes-2.5-Mistral-7b](https://huggingface.co/teknium/OpenHermes-2.5-Mistral-7B) optimized for [nm-vllm](https://github.com/neuralmagic/nm-vllm), a high-throughput serving engine for compressed LLMs.
 
  This model was quantized with [GPTQ](https://arxiv.org/abs/2210.17323) and saved in the Marlin format for efficient 4-bit inference. Marlin is a highly optimized inference kernel for 4 bit models.
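
For context, a minimal sketch of serving a Marlin-format model like this with nm-vllm's offline API might look like the following. The `LLM`/`SamplingParams` interface is the standard vLLM API that nm-vllm exposes; the model id used below is a placeholder assumption, not confirmed by this commit, so substitute the actual repo id from the model card.

```python
# Sketch: offline generation with nm-vllm for a GPTQ model saved in Marlin format.
from vllm import LLM, SamplingParams

# Hypothetical repo id -- replace with the actual Hugging Face repo for this model.
model_id = "neuralmagic/OpenHermes-2.5-Mistral-7B-Marlin"

# nm-vllm should select the Marlin 4-bit kernels for weights saved in Marlin format.
llm = LLM(model=model_id)

params = SamplingParams(max_tokens=100, temperature=0.8)
outputs = llm.generate(["What does 4-bit quantization change about LLM inference?"], params)
print(outputs[0].outputs[0].text)
```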