macadeliccc
committed on
Update README.md
README.md CHANGED
@@ -41,8 +41,6 @@ Thanks to user [bartowski](https://huggingface.co/bartowski) we now have exllama
 
+[bartowski/laser-dolphin-mixtral-2x7b-dpo-exl2](https://huggingface.co/bartowski/laser-dolphin-mixtral-2x7b-dpo-exl2)
 
-His quantizations represent the first ~13B model with GQA support. Check out his repo for more information!
-
 | Branch | Bits | lm_head bits | VRAM (4k) | VRAM (16k) | VRAM (32k) | Description |
 | ----- | ---- | ------- | ------ | ------ | ------ | ------------ |
 | [8_0](https://huggingface.co/bartowski/laser-dolphin-mixtral-2x7b-dpo-exl2/tree/8_0) | 8.0 | 8.0 | 13.7 GB | 15.1 GB | 17.2 GB | Maximum quality that ExLlamaV2 can produce, near unquantized performance. |
@@ -51,6 +49,8 @@ His quantizations represent the first ~13B model with GQA support. Check out his
 | [4_25](https://huggingface.co/bartowski/laser-dolphin-mixtral-2x7b-dpo-exl2/tree/4_25) | 4.25 | 6.0 | 8.2 GB | 9.6 GB | 11.7 GB | GPTQ equivalent bits per weight. |
 | [3_5](https://huggingface.co/bartowski/laser-dolphin-mixtral-2x7b-dpo-exl2/tree/3_5) | 3.5 | 6.0 | 7.0 GB | 8.4 GB | 10.5 GB | Lower quality, not recommended. |
 
+His quantizations represent the first ~13B model with GQA support. Check out his repo for more information!
+
 ### GGUF
 
 *Current GGUF [Quantizations](https://huggingface.co/macadeliccc/laser-dolphin-mixtral-2x7b-dpo-GGUF)*
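
For anyone who wants to pull one of the exl2 branches from the table above programmatically, here is a minimal sketch using `huggingface_hub`; the branch name is passed as the `revision`, and the output directory is just an example:

```python
# Minimal sketch: download a single exl2 quantization branch.
# Each branch in the table above ("8_0", "4_25", "3_5", ...) is a
# git revision of the repo, so it maps to the `revision` argument.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="bartowski/laser-dolphin-mixtral-2x7b-dpo-exl2",
    revision="4_25",  # pick the branch that fits your VRAM budget
    local_dir="laser-dolphin-mixtral-2x7b-dpo-exl2-4_25",  # example path
)
```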
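
Similarly, a minimal sketch for running one of the GGUF quantizations linked above with `llama-cpp-python`; the filename below is a hypothetical placeholder, so check the repo's file listing for the actual quantization names:

```python
# Sketch: fetch one GGUF file and run it with llama-cpp-python.
# The filename is a guess at the repo's naming scheme; see the "Files"
# tab of the GGUF repo for the real filenames.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

model_path = hf_hub_download(
    repo_id="macadeliccc/laser-dolphin-mixtral-2x7b-dpo-GGUF",
    filename="laser-dolphin-mixtral-2x7b-dpo.Q4_K_M.gguf",  # hypothetical name
)

llm = Llama(model_path=model_path, n_ctx=4096)
out = llm("What does MoE stand for?", max_tokens=64)
print(out["choices"][0]["text"])
```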