---
license: apache-2.0
---

# rwkv-4-raven-ggml

GGML models converted from the [`rwkv-4-raven`](https://huggingface.co/BlinkDL/rwkv-4-raven) checkpoints, for use with [`rwkv.cpp`](https://github.com/saharNooby/rwkv.cpp). These models retain the original models' license (Apache 2.0).

## Available models

| Name                                           | `f32` | `f16` | `Q4_0` | `Q4_1` | `Q4_2` | `Q5_1` | `Q8_0` |
| ---------------------------------------------- | ----- | ----- | ------ | ------ | ------ | ------ | ------ |
| `RWKV-4-Raven-1B5-v11-Eng99-20230425-ctx4096`  | Yes   | Yes   | Yes    | No     | Yes    | Yes    | Yes    |
| `RWKV-4-Raven-1B5-v12-Eng98-20230520-ctx4096`  | Yes   | Yes   | Yes    | No     | Yes    | Yes    | Yes    |
| `RWKV-4-Raven-3B-v11-Eng99-20230425-ctx4096`   | Yes   | Yes   | Yes    | No     | Yes    | Yes    | Yes    |
| `RWKV-4-Raven-3B-v12-Eng98-20230520-ctx4096`   | Yes   | Yes   | Yes    | No     | Yes    | Yes    | Yes    |
| `RWKV-4-Raven-7B-v11x-Eng99-20230429-ctx8192`  | Yes   | Yes   | Yes    | No     | Yes    | Yes    | Yes    |
| `RWKV-4-Raven-7B-v12-Eng98-20230521-ctx8192`   | Yes   | Yes   | No     | No     | No     | No     | No     |
| `RWKV-4-Raven-14B-v11x-Eng99-20230501-ctx8192` | Split | Yes   | Yes    | No     | Yes    | Yes    | Yes    |
| `RWKV-4-Raven-14B-v12-Eng98-20230523-ctx8192`  | Split | Yes   | No     | No     | No     | No     | No     |

- The original PyTorch checkpoints (`.pth`) can be downloaded from the [`rwkv-4-raven`](https://huggingface.co/BlinkDL/rwkv-4-raven) repository.
- All `f32` and `f16` models were converted directly from the PyTorch checkpoints using [rwkv.cpp `convert_pytorch_to_ggml.py`](https://github.com/saharNooby/rwkv.cpp/blob/1c363e6d5f4ec7817ceffeeb17bd972b1ce9d9d0/rwkv/convert_pytorch_to_ggml.py); a sketch of the command is shown after this list.
- All quantized models were converted directly from their respective `f32` version using [rwkv.cpp `quantize.py`](https://github.com/saharNooby/rwkv.cpp/blob/1c363e6d5f4ec7817ceffeeb17bd972b1ce9d9d0/rwkv/quantize.py); see the quantization sketch after this list. Quantized models were never converted from other quantized models.
- Conversion and quantization took about an hour, and running `git add` on this repository took another hour (lol). Total time spent was about 2 hours.
- The `f32` version of RWKV-4-Raven-14B is too large to upload to Hugging Face (>50 GB), so it has been split into two files. These files must be concatenated, in order, with a utility like `cat` to reconstruct the original `.bin`; see the example after this list.
- `Q4_1` is not offered because it is an awkward middle ground between `Q4_0` (fastest) and `Q4_2` (smallest).
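As a rough illustration of how the `f32` and `f16` files were produced: the conversion script takes an input `.pth`, an output `.bin`, and a data type. The file names below are hypothetical, and the exact spelling of the data-type argument may vary across rwkv.cpp revisions:

```bash
# Hypothetical file names; converts a downloaded .pth checkpoint to a ggml .bin.
python rwkv/convert_pytorch_to_ggml.py \
  RWKV-4-Raven-1B5-v12-Eng98-20230520-ctx4096.pth \
  RWKV-4-Raven-1B5-v12-Eng98-20230520-ctx4096-f16.bin \
  float16
```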
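Quantization follows the same pattern: each quantized file is produced from the `f32` `.bin`, never from another quantized file. Again the file names are hypothetical, and the spelling of the format argument (e.g. `Q4_2`) may depend on the rwkv.cpp revision:

```bash
# Hypothetical file names; quantizes the f32 model down to Q4_2.
python rwkv/quantize.py \
  RWKV-4-Raven-1B5-v12-Eng98-20230520-ctx4096-f32.bin \
  RWKV-4-Raven-1B5-v12-Eng98-20230520-ctx4096-Q4_2.bin \
  Q4_2
```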
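To reassemble the split `f32` file of RWKV-4-Raven-14B, concatenate the parts in order. The part names below are hypothetical; substitute whatever names the split files carry in this repository:

```bash
# Hypothetical part names; order matters when concatenating.
cat RWKV-4-Raven-14B-f32.bin.part0 RWKV-4-Raven-14B-f32.bin.part1 \
  > RWKV-4-Raven-14B-f32.bin
```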
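Once downloaded (or reassembled), the `.bin` files are loaded by rwkv.cpp directly. As of the commit linked above, the repository ships example Python scripts, so something like the following should work, assuming rwkv.cpp has been built and the model path is adjusted to the file you downloaded:

```bash
# Hypothetical model path; generates text completions with the chosen model.
python rwkv/generate_completions.py \
  RWKV-4-Raven-1B5-v12-Eng98-20230520-ctx4096-Q4_2.bin
```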