dranger003 commited on
Commit
21352bc
1 Parent(s): c68f61a

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +4 -1
README.md CHANGED
@@ -3,8 +3,11 @@ license: cc-by-2.0
3
  library_name: gguf
4
  pipeline_tag: text-generation
5
  ---
6
- GGUF importance matrix (imatrix) quants for https://huggingface.co/ShinojiResearch/Senku-70B-Full
 
 
7
 
 
8
  | Layers | Context | Template |
9
  | --- | --- | --- |
10
  | <pre>80</pre> | <pre>32764</pre> | <pre><\|im_start\|>system<br>{instructions}<\|im_end\|><br><\|im_start\|>user<br>{prompt}<\|im_end\|><br><\|im_start\|>assistant<br>{response}</pre> |
 
3
  library_name: gguf
4
  pipeline_tag: text-generation
5
  ---
6
+ * GGUF importance matrix (imatrix) quants for https://huggingface.co/ShinojiResearch/Senku-70B-Full
7
+ * The importance matrix was trained for ~50K tokens (105 batches of 512 tokens) using a [general purpose imatrix calibration dataset](https://github.com/ggerganov/llama.cpp/discussions/5263#discussioncomment-8395384).
8
+ * The [imatrix is being used on the K-quants](https://github.com/ggerganov/llama.cpp/pull/4930) as well.
9
 
10
+ **2024-02-26**: Updating quants - IQ3_M/IQ3_S/IQ3_XS and IQ2_M/IQ2_S (requires latest commit [a33e6a0d](https://github.com/ggerganov/llama.cpp/commit/a33e6a0d2a66104ea9a906bdbf8a94d050189d91)).
11
  | Layers | Context | Template |
12
  | --- | --- | --- |
13
  | <pre>80</pre> | <pre>32764</pre> | <pre><\|im_start\|>system<br>{instructions}<\|im_end\|><br><\|im_start\|>user<br>{prompt}<\|im_end\|><br><\|im_start\|>assistant<br>{response}</pre> |