dranger003
/

Senku-70B-iMat.GGUF

Text Generation

Inference Endpoints

Model card Files Files and versions Community

dranger003 commited on Feb 26

Commit

21352bc

•

1 Parent(s): c68f61a

Update README.md

Files changed (1) hide show

README.md +4 -1

README.md CHANGED Viewed

@@ -3,8 +3,11 @@ license: cc-by-2.0
 library_name: gguf
 pipeline_tag: text-generation
 ---
-GGUF importance matrix (imatrix) quants for https://huggingface.co/ShinojiResearch/Senku-70B-Full
 | Layers | Context | Template |
 | --- | --- | --- |
 | <pre>80</pre> | <pre>32764</pre> | <pre><\|im_start\|>system<br>{instructions}<\|im_end\|><br><\|im_start\|>user<br>{prompt}<\|im_end\|><br><\|im_start\|>assistant<br>{response}</pre> |

 library_name: gguf
 pipeline_tag: text-generation
 ---
+* GGUF importance matrix (imatrix) quants for https://huggingface.co/ShinojiResearch/Senku-70B-Full
+* The importance matrix was trained for ~50K tokens (105 batches of 512 tokens) using a [general purpose imatrix calibration dataset](https://github.com/ggerganov/llama.cpp/discussions/5263#discussioncomment-8395384).
+* The [imatrix is being used on the K-quants](https://github.com/ggerganov/llama.cpp/pull/4930) as well.
+**2024-02-26**: Updating quants - IQ3_M/IQ3_S/IQ3_XS and IQ2_M/IQ2_S (requires latest commit [a33e6a0d](https://github.com/ggerganov/llama.cpp/commit/a33e6a0d2a66104ea9a906bdbf8a94d050189d91)).
 | Layers | Context | Template |
 | --- | --- | --- |
 | <pre>80</pre> | <pre>32764</pre> | <pre><\|im_start\|>system<br>{instructions}<\|im_end\|><br><\|im_start\|>user<br>{prompt}<\|im_end\|><br><\|im_start\|>assistant<br>{response}</pre> |