neuralmagic/Mistral-7B-Instruct-v0.3-GPTQ-4bit

mgoin committed on May 23, 2024

Commit 2ca503f · verified · 1 Parent(s): 2631819

Update README.md

Files changed (1)

README.md +3 -2

@@ -2,9 +2,10 @@
 license: apache-2.0
 base_model: mistralai/Mistral-7B-Instruct-v0.3
 ---
-# [Mistral-7B-Instruct-v0.3](mistralai/Mistral-7B-Instruct-v0.3) quantized to 4bits
-- weight-only quantization via GPTQ to 4bits with group_size=128
+# Model Card for Mistral-7B-Instruct-v0.3 quantized to 4bit weights
+- Weight-only quantization of [Mistral-7B-Instruct-v0.3](mistralai/Mistral-7B-Instruct-v0.3) via GPTQ to 4bits with group_size=128
 - GPTQ optimized for 99.75% accuracy recovery relative to the unquantized model
 # Open LLM Leaderboard evaluation scores
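
For readers unfamiliar with the `group_size=128` setting in the README text above, the following is a minimal sketch of group-wise 4-bit weight quantization. It is not the GPTQ algorithm and not this repository's code: GPTQ chooses the rounded values using approximate second-order information to reduce layer output error, whereas this sketch uses plain round-to-nearest purely to show how each run of 128 weights gets its own scale and offset. The function names (`quantize_group`, `dequantize_group`, `quantize_row`) are hypothetical.

```python
from typing import List, Tuple

# One quantized group: integer codes, scale, and minimum (offset).
Group = Tuple[List[int], float, float]


def quantize_group(weights: List[float], bits: int = 4) -> Group:
    """Asymmetric round-to-nearest quantization of one group of weights.

    Assumption: plain rounding stands in for GPTQ's error-compensated rounding.
    """
    levels = (1 << bits) - 1              # 15 distinct steps for 4 bits
    w_min, w_max = min(weights), max(weights)
    # Fall back to 1.0 when every weight in the group is identical.
    scale = (w_max - w_min) / levels or 1.0
    codes = [min(levels, max(0, round((w - w_min) / scale))) for w in weights]
    return codes, scale, w_min


def dequantize_group(codes: List[int], scale: float, w_min: float) -> List[float]:
    """Map integer codes back to approximate floating-point weights."""
    return [c * scale + w_min for c in codes]


def quantize_row(row: List[float], group_size: int = 128, bits: int = 4) -> List[Group]:
    """Split one weight row into consecutive groups and quantize each separately."""
    return [
        quantize_group(row[start:start + group_size], bits)
        for start in range(0, len(row), group_size)
    ]
```

With `group_size=128`, a row of 256 weights yields two groups, each storing 128 four-bit codes plus one scale and one offset. Smaller groups track local weight ranges more closely at the cost of storing more scales and offsets.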