wolfram committed on
Commit
4f774f4
1 Parent(s): 56a790f

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -19,7 +19,7 @@ tags:
 
  - HF: [wolfram/miquliz-120b-v2.0](https://huggingface.co/wolfram/miquliz-120b-v2.0)
  - GGUF: [Q2_K | IQ3_XXS | Q4_K_M | Q5_K_M](https://huggingface.co/wolfram/miquliz-120b-v2.0-GGUF)
- - EXL2: [2.4bpw](https://huggingface.co/wolfram/miquliz-120b-v2.0-2.4bpw-h6-exl2) | [2.65bpw](https://huggingface.co/wolfram/miquliz-120b-v2.0-2.65bpw-h6-exl2) | [3.0bpw](https://huggingface.co/wolfram/miquliz-120b-v2.0-3.0bpw-h6-exl2) | [3.5bpw](https://huggingface.co/wolfram/miquliz-120b-v2.0-3.5bpw-h6-exl2) | [4.0bpw](https://huggingface.co/wolfram/miquliz-120b-v2.0-4.0bpw-h6-exl2) | [5.0bpw](https://huggingface.co/wolfram/miquliz-120b-v2.0-5.0bpw-h6-exl2)
+ - EXL2: 2.4bpw | [2.65bpw](https://huggingface.co/wolfram/miquliz-120b-v2.0-2.65bpw-h6-exl2) | [3.0bpw](https://huggingface.co/wolfram/miquliz-120b-v2.0-3.0bpw-h6-exl2) | [3.5bpw](https://huggingface.co/wolfram/miquliz-120b-v2.0-3.5bpw-h6-exl2) | [4.0bpw](https://huggingface.co/wolfram/miquliz-120b-v2.0-4.0bpw-h6-exl2) | [5.0bpw](https://huggingface.co/wolfram/miquliz-120b-v2.0-5.0bpw-h6-exl2)
  - **Max Context w/ 48 GB VRAM:** (24 GB VRAM is not enough, even for 2.4bpw, use [GGUF](https://huggingface.co/wolfram/miquliz-120b-v2.0-GGUF) instead!)
  - **2.4bpw:** 32K (32768 tokens) w/ 8-bit cache, 21K (21504 tokens) w/o 8-bit cache
  - **2.65bpw:** 30K (30720 tokens) w/ 8-bit cache, 15K (15360 tokens) w/o 8-bit cache