---
license: cc-by-nc-4.0
pipeline_tag: text-generation
base_model: CohereForAI/c4ai-command-r-plus
---

# Command R+ GGUF

## Description

This repository contains experimental GGUF weights that are currently only compatible with pull request #6491 of llama.cpp. I will update them once Command R+ support is merged into the main llama.cpp repository.

## Getting started

1. Clone the Carolinabanana/llama.cpp repository and check out the pinned commit:

   ```bash
   git clone https://github.com/Carolinabanana/llama.cpp.git llama.cpp-fork
   cd llama.cpp-fork
   git reset --hard 8b6577bd631fec33eeadb4b9dfc5a07ed2118148
   ```

2. Build it using `make`.
3. Use it in the same way as regular llama.cpp. If you're unsure how to start, you can use the following command as a starting point (a multi-turn sketch follows this list):

   ```bash
   ./main -p "<|START_OF_TURN_TOKEN|><|USER_TOKEN|>Who are you?<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>" --color -m /path/to/command-r-plus-Q3_K_L-00001-of-00002.gguf
   ```
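For a conversation, the prompt extends to multiple turns. The sketch below assumes the turn layout simply repeats the pattern from the single-turn example above (user turn, chatbot turn, each wrapped in start/end turn tokens, ending with an open chatbot turn for the model to complete); consult the official model card for the authoritative chat template. The model path, context size, and sampling flags are illustrative, not prescriptive:

```bash
# Build a multi-turn prompt by repeating the turn structure.
# Assumption: this mirrors the single-turn example above; verify against
# the Command R+ model card before relying on it.
PROMPT="<|START_OF_TURN_TOKEN|><|USER_TOKEN|>Who are you?<|END_OF_TURN_TOKEN|>"
PROMPT+="<|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>I am Command R+, a large language model.<|END_OF_TURN_TOKEN|>"
PROMPT+="<|START_OF_TURN_TOKEN|><|USER_TOKEN|>Summarise that in five words.<|END_OF_TURN_TOKEN|>"
PROMPT+="<|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>"

# Illustrative path and flags: -c sets context size, -n caps generated tokens.
./main --color -c 4096 -n 256 \
  -m /path/to/command-r-plus-Q3_K_L-00001-of-00002.gguf \
  -p "$PROMPT"
```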

## Merging Weights

After commit 8a28d12, the weights are split with `gguf-split`, so you don't have to merge them yourself. Simply pass the first split, as in the example above, and llama.cpp will load all splits automatically. If, for some reason, you want to merge the splits anyway, you can use the following command:

```bash
./gguf-split --merge /path/to/command-r-plus-f16-00001-of-00005.gguf /path/to/command-r-plus-f16-combined.gguf
```
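After merging, point llama.cpp at the combined file instead of the first split. A quick sketch, with hypothetical paths:

```bash
# Hypothetical path: use the actual location of your merged file.
./main --color \
  -m /path/to/command-r-plus-f16-combined.gguf \
  -p "<|START_OF_TURN_TOKEN|><|USER_TOKEN|>Who are you?<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>"
```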