---
base_model: RWKV/rwkv-6-world-1b6
library_name: gguf
license: apache-2.0
quantized_by: Lyte
tags:
  - text-generation
  - rwkv
  - rwkv-6
---

# RWKV-6-World-1.6B-GGUF-Q4_K_M

This repo contains RWKV-6-World-1.6B quantized to GGUF with llama.cpp build b3651.

## How to run the model

- Get the latest llama.cpp:

```sh
git clone https://github.com/ggerganov/llama.cpp
```
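The clone only gives you the sources, so `llama-cli` has to be compiled before the run step below. A minimal build sketch, run from inside the `llama.cpp` checkout, using the CMake workflow the project documents:

```sh
# configure and compile an optimized build
cmake -B build
cmake --build build --config Release -j
```

With this layout the binary ends up at `build/bin/llama-cli`, so either copy it next to the `model` folder or adjust the path in the run command at the end.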
- Download the GGUF file into a new `model` folder inside llama.cpp (choose your quant):

```sh
cd llama.cpp
mkdir model
# the .gguf files are stored with Git LFS, so make sure it is set up first
git lfs install
git clone https://huggingface.co/Lyte/RWKV-6-World-1.6B-GGUF/
mv RWKV-6-World-1.6B-GGUF/RWKV-6-World-1.6B-GGUF-Q4_K_M.gguf model/
rm -r RWKV-6-World-1.6B-GGUF
```
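Cloning downloads every file in the repo. As an alternative sketch, the `huggingface-cli` tool from the `huggingface_hub` package can fetch just the single quant instead of the whole repo:

```sh
pip install -U "huggingface_hub[cli]"
# download only the Q4_K_M file straight into the model folder
huggingface-cli download Lyte/RWKV-6-World-1.6B-GGUF \
  RWKV-6-World-1.6B-GGUF-Q4_K_M.gguf --local-dir model
```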
- On Windows, instead of git cloning the repo, you can simply create the `model` folder inside the llama.cpp folder, then download the GGUF file into it from the repo page.
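If you prefer the command line on Windows too, recent Windows versions ship `curl.exe`, and Hugging Face serves direct-download links under `resolve/main/` (a sketch, assuming the Q4_K_M quant):

```sh
mkdir model
curl -L -o model/RWKV-6-World-1.6B-GGUF-Q4_K_M.gguf https://huggingface.co/Lyte/RWKV-6-World-1.6B-GGUF/resolve/main/RWKV-6-World-1.6B-GGUF-Q4_K_M.gguf
```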

- Now, to run the model, you can use the following command:

```sh
./llama-cli -m ./model/RWKV-6-World-1.6B-GGUF-Q4_K_M.gguf --in-suffix "Assistant:" --interactive-first -c 1024 --temp 0.7 --top-k 50 --top-p 0.95 -n 128 -p "Assistant: Hello, what can I help you with today?\nUser:" -r "User:"
```
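Here `-c` sets the context window, `--temp`/`--top-k`/`--top-p` control sampling, `-n` caps the tokens generated per turn, and `-r "User:"` hands control back to you whenever the model emits the reverse prompt. To sanity-check the model without the interactive chat loop, the same sampling settings also work for a one-shot run (a sketch; the prompt is just an example, and `llama-cli` exits after generating):

```sh
./llama-cli -m ./model/RWKV-6-World-1.6B-GGUF-Q4_K_M.gguf \
  -c 1024 --temp 0.7 --top-k 50 --top-p 0.95 -n 128 \
  -p "User: Write a haiku about rivers.\nAssistant:"
```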