dranger003
commited on
Commit
•
21352bc
1
Parent(s):
c68f61a
Update README.md
Browse files
README.md
CHANGED
@@ -3,8 +3,11 @@ license: cc-by-2.0
|
|
3 |
library_name: gguf
|
4 |
pipeline_tag: text-generation
|
5 |
---
|
6 |
-
GGUF importance matrix (imatrix) quants for https://huggingface.co/ShinojiResearch/Senku-70B-Full
|
|
|
|
|
7 |
|
|
|
8 |
| Layers | Context | Template |
|
9 |
| --- | --- | --- |
|
10 |
| <pre>80</pre> | <pre>32764</pre> | <pre><\|im_start\|>system<br>{instructions}<\|im_end\|><br><\|im_start\|>user<br>{prompt}<\|im_end\|><br><\|im_start\|>assistant<br>{response}</pre> |
|
|
|
3 |
library_name: gguf
|
4 |
pipeline_tag: text-generation
|
5 |
---
|
6 |
+
* GGUF importance matrix (imatrix) quants for https://huggingface.co/ShinojiResearch/Senku-70B-Full
|
7 |
+
* The importance matrix was trained for ~50K tokens (105 batches of 512 tokens) using a [general purpose imatrix calibration dataset](https://github.com/ggerganov/llama.cpp/discussions/5263#discussioncomment-8395384).
|
8 |
+
* The [imatrix is being used on the K-quants](https://github.com/ggerganov/llama.cpp/pull/4930) as well.
|
9 |
|
10 |
+
**2024-02-26**: Updating quants - IQ3_M/IQ3_S/IQ3_XS and IQ2_M/IQ2_S (requires latest commit [a33e6a0d](https://github.com/ggerganov/llama.cpp/commit/a33e6a0d2a66104ea9a906bdbf8a94d050189d91)).
|
11 |
| Layers | Context | Template |
|
12 |
| --- | --- | --- |
|
13 |
| <pre>80</pre> | <pre>32764</pre> | <pre><\|im_start\|>system<br>{instructions}<\|im_end\|><br><\|im_start\|>user<br>{prompt}<\|im_end\|><br><\|im_start\|>assistant<br>{response}</pre> |
|