---
base_model: RWKV/rwkv-6-world-1b6
library_name: gguf
license: apache-2.0
quantized_by: Lyte
tags:
- text-generation
- rwkv
- rwkv-6
---

# RWKV-6-World-1.6B-GGUF-Q4_K_M

This repo contains RWKV-6-World-1.6B quantized to GGUF with llama.cpp release b3651.

## How to run the model

* Get the latest llama.cpp:

```
git clone https://github.com/ggerganov/llama.cpp
```
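
* Cloning only fetches the sources, so build the `llama-cli` binary before the run step below. A minimal sketch, assuming the Makefile build that this llama.cpp release still ships (a CMake build works as well):

```
# build the llama-cli binary in the repo root
cd llama.cpp
make llama-cli
cd ..
```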

* Download the GGUF file into a new `model` folder inside llama.cpp (choose your quant):

```
cd llama.cpp
mkdir model
# cloning the model repo needs git-lfs, otherwise you only get small pointer files
git clone https://huggingface.co/Lyte/RWKV-6-World-1.6B-GGUF/
# keep the chosen quant and drop the rest of the clone
mv RWKV-6-World-1.6B-GGUF/RWKV-6-World-1.6B-GGUF-Q4_K_M.gguf model/
rm -rf RWKV-6-World-1.6B-GGUF
```

* On Windows, instead of git cloning the model repo, create the `model` folder inside the llama.cpp folder and download the GGUF file from the repo page into it (or use the single-file download sketched below).
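
* Alternatively, on any platform, you can fetch just the one file and skip git-lfs entirely. A minimal sketch, assuming the standard Hugging Face `resolve/main` download path for this repo:

```
# download only the Q4_K_M quant straight into the model folder
curl -L -o model/RWKV-6-World-1.6B-GGUF-Q4_K_M.gguf \
  https://huggingface.co/Lyte/RWKV-6-World-1.6B-GGUF/resolve/main/RWKV-6-World-1.6B-GGUF-Q4_K_M.gguf
```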

* Now run the model with the following command:

```
./llama-cli -m ./model/RWKV-6-World-1.6B-GGUF-Q4_K_M.gguf --in-prefix "\nUser:" --in-suffix "\nAssistant:" --interactive-first -c 1024 --temp 0.7 --top-k 40 --top-p 0.95 -n 64 -p "Assistant: Hello, what can I help you with today?\n" -r "User"
```
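
A few notes on the flags: `--in-prefix`/`--in-suffix` wrap each thing you type as a `User:`/`Assistant:` turn, `-c 1024` sets the context size, `--temp 0.7`, `--top-k 40`, and `--top-p 0.95` control sampling, `-n 64` caps the length of each reply, and `-r "User"` stops generation as soon as the model starts a new user turn. Raise `-c` and `-n` if you want longer chats and replies.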