---
base_model: RWKV/rwkv-6-world-1b6
library_name: gguf
license: apache-2.0
quantized_by: Lyte
tags:
- text-generation
- rwkv
- rwkv-6
---

# RWKV-6-World-1.6B-GGUF-Q4_K_M

This repo contains RWKV-6-World-1.6B freshly re-quantized to GGUF with the latest llama.cpp [b3771](https://github.com/ggerganov/llama.cpp/releases/tag/b3771).

## How to run the model

* Get the latest llama.cpp:
```
git clone https://github.com/ggerganov/llama.cpp
```
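Then build the binaries. A minimal sketch assuming the Makefile build, which release b3771 still supports (CMake works as well) and which leaves `llama-cli` in the repo root:
```
cd llama.cpp
make
cd ..
```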

* Download the GGUF file into a new `model` folder inside llama.cpp (choose your quant):
```
cd llama.cpp
mkdir model
git lfs install   # the .gguf weights are stored in Git LFS, so this is needed for the clone to fetch them
git clone https://huggingface.co/Lyte/RWKV-6-World-1.6B-GGUF/
mv RWKV-6-World-1.6B-GGUF/RWKV-6-World-1.6B-GGUF-Q4_K_M.gguf model/
rm -r RWKV-6-World-1.6B-GGUF
```
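Alternatively, if you have the Hugging Face Hub CLI installed, you can fetch just the single quant you want instead of cloning the whole repo. A sketch (adjust the filename to your chosen quant):
```
pip install -U "huggingface_hub[cli]"
huggingface-cli download Lyte/RWKV-6-World-1.6B-GGUF RWKV-6-World-1.6B-GGUF-Q4_K_M.gguf --local-dir model
```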
* On Windows, instead of cloning the repo, create a `model` folder inside the llama.cpp folder, then open this repo's "Files and versions" tab and download the quant you want into that folder.

* Now run the model with the following command (note that temperature is set with `--temp`; llama.cpp's `-t` flag sets the thread count):
```
./llama-cli -m ./model/RWKV-6-World-1.6B-GGUF-Q4_K_M.gguf --in-suffix "Assistant:" --interactive-first -c 1024 --temp 0.7 --top-k 50 --top-p 0.95 -n 128 -p "Assistant: Hello, what can I help you with today?\nUser:" -r "User:"
```
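Here `-r "User:"` is the reverse prompt that hands control back to you whenever the model emits `User:`, `--in-suffix "Assistant:"` is appended after each of your inputs so the model answers in character, `-c 1024` sets the context size, `-n 128` caps tokens per response, and `--temp`/`--top-k`/`--top-p` control sampling.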