---
license: apache-2.0
---

# rwkv-4-raven-ggml

GGML models converted from the [`rwkv-4-raven`](https://huggingface.co/BlinkDL/rwkv-4-raven) checkpoints, for use with [`rwkv.cpp`](https://github.com/saharNooby/rwkv.cpp). These models retain the original models' license (Apache 2.0).

## Available models

| Name                                           | `f32` | `f16` | `Q4_0` | `Q4_1` | `Q4_2` | `Q5_1` | `Q8_0` |
| ---------------------------------------------- | ----- | ----- | ------ | ------ | ------ | ------ | ------ |
| `RWKV-4-Raven-1B5-v11-Eng99-20230425-ctx4096`  | Yes   | Yes   | Yes    | No     | Yes    | Yes    | Yes    |
| `RWKV-4-Raven-1B5-v12-Eng98-20230520-ctx4096`  | Yes   | Yes   | Yes    | No     | Yes    | Yes    | Yes    |
| `RWKV-4-Raven-3B-v11-Eng99-20230425-ctx4096`   | Yes   | Yes   | Yes    | No     | Yes    | Yes    | Yes    |
| `RWKV-4-Raven-3B-v12-Eng98-20230520-ctx4096`   | Yes   | Yes   | Yes    | No     | Yes    | Yes    | Yes    |
| `RWKV-4-Raven-7B-v11x-Eng99-20230429-ctx8192`  | Yes   | Yes   | Yes    | No     | Yes    | Yes    | Yes    |
| `RWKV-4-Raven-7B-v12-Eng98-20230521-ctx8192`   | Yes   | Yes   | No     | No     | No     | No     | No     |
| `RWKV-4-Raven-14B-v11x-Eng99-20230501-ctx8192` | Split | Yes   | Yes    | No     | Yes    | Yes    | Yes    |
| `RWKV-4-Raven-14B-v12-Eng98-20230523-ctx8192`  | Split | Yes   | No     | No     | No     | No     | No     |

- The original PyTorch checkpoints (`.pth`) can be downloaded from the [`rwkv-4-raven`](https://huggingface.co/BlinkDL/rwkv-4-raven) repository.
- All `f32` and `f16` models were converted directly from the PyTorch checkpoints using [rwkv.cpp `convert_pytorch_to_ggml.py`](https://github.com/saharNooby/rwkv.cpp/blob/1c363e6d5f4ec7817ceffeeb17bd972b1ce9d9d0/rwkv/convert_pytorch_to_ggml.py); a sketch of the command is shown after this list.
- All quantized models were converted directly from their respective `f32` version using [rwkv.cpp `quantize.py`](https://github.com/saharNooby/rwkv.cpp/blob/1c363e6d5f4ec7817ceffeeb17bd972b1ce9d9d0/rwkv/quantize.py); see the quantization sketch after this list. Quantized models were never converted from other quantized models.
- Conversion and quantization took about an hour, and running `git add` on this repository took another hour (lol). Total time spent was about 2 hours.
- The `f32` version of RWKV-4-Raven-14B is too large to upload to Hugging Face (>50 GB), so it has been split into two files. These files must be concatenated, in order, with a utility like `cat` to reconstruct the original `.bin`; see the example after this list.
- `Q4_1` is not offered because it is an awkward middle ground between `Q4_0` (fastest) and `Q4_2` (smallest).
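As a rough illustration of how the `f32` and `f16` files were produced: the conversion script takes an input `.pth`, an output `.bin`, and a data type. The file names below are hypothetical, and the exact spelling of the data-type argument may vary across rwkv.cpp revisions:

```bash
# Hypothetical file names; converts a downloaded .pth checkpoint to a ggml .bin.
python rwkv/convert_pytorch_to_ggml.py \
  RWKV-4-Raven-1B5-v12-Eng98-20230520-ctx4096.pth \
  RWKV-4-Raven-1B5-v12-Eng98-20230520-ctx4096-f16.bin \
  float16
```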
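Quantization follows the same pattern: each quantized file is produced from the `f32` `.bin`, never from another quantized file. Again the file names are hypothetical, and the spelling of the format argument (e.g. `Q4_2`) may depend on the rwkv.cpp revision:

```bash
# Hypothetical file names; quantizes the f32 model down to Q4_2.
python rwkv/quantize.py \
  RWKV-4-Raven-1B5-v12-Eng98-20230520-ctx4096-f32.bin \
  RWKV-4-Raven-1B5-v12-Eng98-20230520-ctx4096-Q4_2.bin \
  Q4_2
```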
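To reassemble the split `f32` file of RWKV-4-Raven-14B, concatenate the parts in order. The part names below are hypothetical; substitute whatever names the split files carry in this repository:

```bash
# Hypothetical part names; order matters when concatenating.
cat RWKV-4-Raven-14B-f32.bin.part0 RWKV-4-Raven-14B-f32.bin.part1 \
  > RWKV-4-Raven-14B-f32.bin
```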
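Once downloaded (or reassembled), the `.bin` files are loaded by rwkv.cpp directly. As of the commit linked above, the repository ships example Python scripts, so something like the following should work, assuming rwkv.cpp has been built and the model path is adjusted to the file you downloaded:

```bash
# Hypothetical model path; generates text completions with the chosen model.
python rwkv/generate_completions.py \
  RWKV-4-Raven-1B5-v12-Eng98-20230520-ctx4096-Q4_2.bin
```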