remove BOS from llama.cpp example (automatically added by llama.cpp)
README.md CHANGED
@@ -93,7 +93,7 @@ Generated importance matrix file: [Cerebrum-1.0-8x7b.imatrix.dat](https://huggin
 Make sure you are using `llama.cpp` from commit [0becb22](https://github.com/ggerganov/llama.cpp/commit/0becb22ac05b6542bd9d5f2235691aa1d3d4d307) or later.
 
 ```shell
-./main -ngl 33 -m Cerebrum-1.0-8x7b.IQ2_XS.gguf --override-kv llama.expert_used_count=int:3 --color -c 16384 --temp 0.7 --repeat-penalty 1.0 -n -1 -p "
+./main -ngl 33 -m Cerebrum-1.0-8x7b.IQ2_XS.gguf --override-kv llama.expert_used_count=int:3 --color -c 16384 --temp 0.7 --repeat-penalty 1.0 -n -1 -p "A chat between a user and a thinking artificial intelligence assistant. The assistant describes its thought process and gives helpful and detailed answers to the user's questions.\nUser: {prompt}\nAI:"
 ```
 
 Change `-ngl 33` to the number of layers to offload to GPU. Remove it if you don't have GPU acceleration.
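The note about `-ngl` in the changed README can be applied like so; this is a sketch assuming the GGUF file is in the current directory, with `{prompt}` left as the placeholder the README's prompt template uses:

```shell
# CPU-only run: the -ngl flag is dropped entirely, per the README's note
# ("Remove it if you don't have GPU acceleration").
# Replace {prompt} with your actual question before running.
./main -m Cerebrum-1.0-8x7b.IQ2_XS.gguf \
  --override-kv llama.expert_used_count=int:3 \
  --color -c 16384 --temp 0.7 --repeat-penalty 1.0 -n -1 \
  -p "A chat between a user and a thinking artificial intelligence assistant. The assistant describes its thought process and gives helpful and detailed answers to the user's questions.\nUser: {prompt}\nAI:"
```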
|