Commit e33a1af by leafspark (parent: 490f71b)

readme: update info

Files changed (1):
  1. README.md +3 -7
README.md CHANGED

````diff
@@ -22,7 +22,7 @@ Quantized from [https://huggingface.co/deepseek-ai/DeepSeek-V2-Chat](https://huggingface.co/deepseek-ai/DeepSeek-V2-Chat)
 
 Using llama.cpp [b3026](https://github.com/ggerganov/llama.cpp/releases/tag/b3026) for quantization. Given the rapid release of llama.cpp builds, this will likely change over time.
 
-**If you are using an older quant, please set the metadata KV overrides below.**
+**Please set the metadata KV overrides below.**
 
 # Usage:
 
@@ -85,7 +85,8 @@ Note: Use iMatrix quants only if you can fully offload to GPU, otherwise speed w
 |----------|-------------|-----------|--------------------------------------------|-------------|----------|-------|
 | BF16 | Available | 439 GB | Lossless :) | Old | No | Q8_0 is sufficient for most cases |
 | Q8_0 | Available | 233.27 GB | High quality *recommended* | Updated | Yes | |
-| Q5_K_M | Uploading | 155 GB | Medium-low quality | Updated | Yes | |
+| Q8_0 | Available | ~110 GB | High quality *recommended* | Updated | Yes | |
+| Q5_K_M | Available | 155 GB | Medium-high quality *recommended* | Updated | Yes | |
 | Q4_K_M | Available | 132 GB | Medium quality *recommended* | Old | No | |
 | Q3_K_M | Available | 104 GB | Medium-low quality | Updated | Yes | |
 | IQ3_XS | Available | 89.6 GB | Better than Q3_K_M | Old | Yes | |
@@ -101,7 +102,6 @@ Note: Use iMatrix quants only if you can fully offload to GPU, otherwise speed w
 | Q5_K_S | |
 | Q4_K_S | |
 | Q3_K_S | |
-| Q6_K | |
 | IQ4_XS | |
 | IQ2_XS | |
 | IQ2_S | |
@@ -118,10 +118,6 @@ deepseek2.leading_dense_block_count=int:1
 deepseek2.rope.scaling.yarn_log_multiplier=float:0.0707
 ```
 
-Quants with "Updated" metadata contain these parameters, so as long as you're running a supported build of llama.cpp no `--override-kv` parameters are required.
-
-A precompiled Windows AVX2 version is available at `llama.cpp-039896407afd40e54321d47c5063c46a52da3e01.zip` in the root of this repo.
-
 # License:
 - DeepSeek license for model weights, which can be found in the `LICENSE` file in the root of this repo
 - MIT license for any repo code
````
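
For context on the "metadata KV overrides" this commit refers to: llama.cpp accepts them as repeated `--override-kv key=type:value` flags at load time. A minimal sketch, assuming a hypothetical quant filename and using the two override keys visible in this diff (the full override block in the README may contain more; the binary is named `main` in b3026-era builds and `llama-cli` in later ones):

```shell
# Sketch: running an older quant with the KV overrides from the README.
# The .gguf path is a placeholder; key names/values are taken from the
# README's override block shown in the diff above.
./main -m ./DeepSeek-V2-Chat.Q4_K_M.gguf \
  --override-kv deepseek2.leading_dense_block_count=int:1 \
  --override-kv deepseek2.rope.scaling.yarn_log_multiplier=float:0.0707 \
  -p "Hello"
```

Quants whose GGUF metadata was already regenerated ("Updated" in the table) should not need these flags on a supported llama.cpp build.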