dranger003 committed
Commit 17e1904
Parent: 19fa7c3

Update README.md

Files changed (1): README.md (+2, -0)
README.md CHANGED
@@ -10,6 +10,8 @@ base_model: CohereForAI/c4ai-command-r-plus
 Noeda's fork will not work with these weights; you will need the main branch of llama.cpp.
 Also, I am currently running perplexity on all the quants posted here and will update this model page with the results.
 
+**NOTE**: Do not concatenate splits (or chunks) - use `gguf-split` to merge the files if you really need to (most likely not needed for most use cases).
+
 * GGUF importance matrix (imatrix) quants for https://huggingface.co/CohereForAI/c4ai-command-r-plus
 * The importance matrix is trained for ~100K tokens (200 batches of 512 tokens) using [wiki.train.raw](https://huggingface.co/datasets/wikitext).
 * [Which GGUF is right for me? (from Artefact2)](https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9) - X axis is file size and Y axis is perplexity (lower perplexity is better quality). Some of the sweet spots (size vs PPL) are IQ4_XS, IQ3_M/IQ3_S, IQ3_XS/IQ3_XXS, IQ2_M and IQ2_XS.
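For context on the NOTE added in this commit: simply concatenating the split files (e.g. with `cat`) produces an invalid model, because each split is itself a GGUF file with its own header and split metadata. The `gguf-split` tool that ships with llama.cpp performs a proper merge. A minimal sketch, assuming the llama.cpp binaries are in the current directory; the filenames are hypothetical, and recent llama.cpp can load the first split directly, which is why a merge is usually unnecessary:

```sh
# Merge split GGUF files back into a single file (llama.cpp's gguf-split tool).
# Point it at the FIRST split; the remaining parts are found via the
# *-00001-of-0000N.gguf naming convention. Filenames are hypothetical examples.
./gguf-split --merge \
  ggml-c4ai-command-r-plus-iq4_xs-00001-of-00002.gguf \
  ggml-c4ai-command-r-plus-iq4_xs.gguf
```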
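Likewise, a sketch of how an imatrix like the one described above might be produced and applied, using llama.cpp's `imatrix` and `quantize` tools. The exact flags the author used are not stated, so treat the values as illustrative: 200 chunks of 512 tokens matches the ~100K-token figure in the bullet list.

```sh
# 1) Compute the importance matrix over ~100K tokens of wiki.train.raw
#    (200 chunks x 512-token context).
./imatrix -m ggml-c4ai-command-r-plus-f16.gguf \
  -f wiki.train.raw -c 512 --chunks 200 \
  -o c4ai-command-r-plus.imatrix

# 2) Apply it when quantizing, e.g. to IQ4_XS.
./quantize --imatrix c4ai-command-r-plus.imatrix \
  ggml-c4ai-command-r-plus-f16.gguf \
  ggml-c4ai-command-r-plus-iq4_xs.gguf IQ4_XS
```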