---
license: cc-by-nc-4.0
pipeline_tag: text-generation
base_model: CohereForAI/c4ai-command-r-plus
---

# Command R+ GGUF

## Description
This repository contains GGUF weights of Command R+ for use with `llama.cpp`. Support for the model was added in release [`b2636`](https://github.com/ggerganov/llama.cpp/releases/tag/b2636). Since commit `dd2d53a`, all weights in this repo include a chat template.

The `imatrix` folder contains the imatrix quants. The importance matrix was computed from [kalomaze's](https://github.com/kalomaze) [`groups_merged.txt`](https://github.com/ggerganov/llama.cpp/discussions/5263#discussioncomment-8395384).


## Quickstart
1. Ensure that you have `llama.cpp` release [`b2636`](https://github.com/ggerganov/llama.cpp/releases/tag/b2636) or newer.
2. Run the model with the command below:
```bash
./main -p "<|START_OF_TURN_TOKEN|><|USER_TOKEN|>Who are you?<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>" --color -m /path/to/command-r-plus-Q3_K_L-00001-of-00002.gguf
```
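
The prompt in the command above wraps the user message in turn tokens by hand. As a minimal sketch, the wrapping could be moved into a small shell function. `format_prompt` is a hypothetical helper, not part of `llama.cpp`, and it only reproduces the single-turn layout shown above:

```bash
# Hypothetical helper (not part of llama.cpp): wrap one user message in the
# same turn tokens that the Quickstart prompt uses.
format_prompt() {
  printf '%s%s%s' \
    '<|START_OF_TURN_TOKEN|><|USER_TOKEN|>' \
    "$1" \
    '<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>'
}

# Usage, with the same flags as the Quickstart command:
# ./main -p "$(format_prompt 'Who are you?')" --color -m /path/to/command-r-plus-Q3_K_L-00001-of-00002.gguf
```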

## Perplexity on `wikitext-2-raw` [WIP]
| Variant  | PPL Value | Standard Deviation |
|----------|-----------|--------------------|
| Q2_K     | 5.7178    | +/- 0.03418        |
| Q3_K_L   | 4.6214    | +/- 0.02629        |
| Q4_K_M   | 4.4625    | +/- 0.02522        |
| f16      | 4.3845    | +/- 0.02468        |
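
To put the numbers above in perspective, the sketch below computes each quant's perplexity increase relative to the f16 baseline. The values are copied from the table; `ppl_increase` is a hypothetical helper name, and it assumes `awk` is available:

```bash
# Relative perplexity increase of each quant over f16 (baseline 4.3845).
# All values are copied from the table above.
ppl_increase() {
  awk -v base=4.3845 '{ printf "%s %.2f%%\n", $1, ($2 / base - 1) * 100 }' <<'EOF'
Q2_K 5.7178
Q3_K_L 4.6214
Q4_K_M 4.4625
EOF
}

# Prints one line per variant: name, then percentage increase over f16.
ppl_increase
```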

## Merging Weights
Since commit `8a28d12`, the weights are split with `gguf-split`, so you no longer need to merge them. Simply pass the first split, as in the Quickstart example, and `llama.cpp` will automatically load the remaining splits. If you still want to merge the splits, use the following command:
```bash
./gguf-split --merge /path/to/command-r-plus-f16-00001-of-00005.gguf /path/to/command-r-plus-f16-combined.gguf
```
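
Because loading and merging both start from the first split, it can help to confirm that every split was downloaded. The following is a minimal sketch, assuming the `-NNNNN-of-MMMMM.gguf` naming used in this README. `check_splits` is a hypothetical helper, not a `llama.cpp` tool:

```bash
# Hypothetical helper: given the path to the first split, print any sibling
# split that is missing. Assumes the "-NNNNN-of-MMMMM.gguf" naming scheme.
check_splits() {
  first="$1"
  total="${first##*-of-}"          # e.g. "00005.gguf"
  total="${total%.gguf}"           # e.g. "00005"
  prefix="${first%-*-of-*}"        # path without "-00001-of-00005.gguf"
  count=$(expr "$total" + 0)       # drop leading zeros, e.g. 5
  missing=0
  i=1
  while [ "$i" -le "$count" ]; do
    f=$(printf '%s-%05d-of-%s.gguf' "$prefix" "$i" "$total")
    if [ ! -f "$f" ]; then
      echo "missing: $f"
      missing=1
    fi
    i=$((i + 1))
  done
  return "$missing"                # 0 if all splits exist, 1 otherwise
}

# Usage:
# check_splits /path/to/command-r-plus-f16-00001-of-00005.gguf && echo "all splits present"
```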