ISTA-DASLab
/

Mistral-7B-Instruct-v0.2-AQLM-2Bit-2x8

Text Generation

text-generation-inference

Inference Endpoints

8-bit precision


SpiridonSunRotator committed on Apr 13

Commit

1ff5b51

•

1 Parent(s): d0778e5

Create README.md

Files changed (1)

README.md +18 -0

README.md ADDED

	@@ -0,0 +1,18 @@

+---
+library_name: transformers
+tags:
+- mistral
+- finetuned
+- conversational
+- text-generation-inference
+---
+Official [AQLM](https://arxiv.org/abs/2401.06118) quantization of [mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2).
+This quantization uses 2 codebooks of 8 bits each (the 2x8 scheme).
+Results:
+| Model | Quantization | MMLU (5-shot) | Model size, GB |
+|------|------|-------|------|
+| mistralai/Mistral-7B-Instruct-v0.2 | None | 0.5912 | 14.5 |
+|  | 2x8 | 0.4384 |  2.3 |
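To put the table in perspective, here is a minimal Python sketch of the arithmetic behind the 2x8 scheme and the reported numbers. It is an illustration, not part of the committed README. The group size of 8 weights per code and the 16-bit baseline precision are assumptions not stated in the README; the sizes and MMLU scores are copied from the table.

```python
# Figures copied from the results table in the README diff above.
FP16_SIZE_GB = 14.5   # unquantized model size
AQLM_SIZE_GB = 2.3    # 2x8 AQLM model size
FP16_MMLU = 0.5912    # unquantized MMLU (5-shot)
AQLM_MMLU = 0.4384    # 2x8 AQLM MMLU (5-shot)


def bits_per_weight(num_codebooks: int, codebook_bits: int, group_size: int) -> float:
    """Nominal code bits per weight.

    Each group of `group_size` weights stores one `codebook_bits`-bit
    index per codebook, so the cost is shared across the group.
    """
    return num_codebooks * codebook_bits / group_size


# ASSUMPTION: weights are grouped 8 at a time (not stated in the README).
nominal_bits = bits_per_weight(num_codebooks=2, codebook_bits=8, group_size=8)

# Ratios derived directly from the table.
compression = FP16_SIZE_GB / AQLM_SIZE_GB
retention = AQLM_MMLU / FP16_MMLU

# ASSUMPTION: the unquantized checkpoint stores 16 bits per parameter.
# The effective figure averages over everything in the checkpoint,
# including any parts that are not stored as 2-bit codes.
effective_bits = 16 * AQLM_SIZE_GB / FP16_SIZE_GB

print(f"nominal bits per weight:   {nominal_bits:.2f}")
print(f"size reduction:            {compression:.2f}x")
print(f"MMLU retained:             {retention:.1%}")
print(f"effective bits per weight: {effective_bits:.2f}")
```

Under these assumptions the nominal cost is 2 bits per weight, which matches the "2Bit" in the repository name. The effective figure derived from the file sizes comes out somewhat higher, which is expected if parts of the checkpoint are kept at higher precision.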