---
language:
- en
tags:
- ggml
- text-generation
- causal-lm
- rwkv
license: apache-2.0
datasets:
- EleutherAI/pile
- togethercomputer/RedPajama-Data-1T
---

**Last updated:** 2023-05-23

This is [BlinkDL/rwkv-4-pileplus](https://huggingface.co/BlinkDL/rwkv-4-pileplus) converted to GGML for use with rwkv.cpp and KoboldCpp. [rwkv.cpp's conversion instructions](https://github.com/saharNooby/rwkv.cpp#option-32-convert-and-quantize-pytorch-model) were followed.

### RAM USAGE (KoboldCpp)

Model | RAM usage (with OpenBLAS)
:--:|:--:
Unloaded | 41.3 MiB
169M q4_0 | 249.0 MiB
169M q5_0 | 254.2 MiB
169M q5_1 | 259.6 MiB
430M q4_0 | 443.7 MiB
430M q5_0 | 463.2 MiB
430M q5_1 | 482.8 MiB
1.5B q4_0 | 1.2 GiB
1.5B q5_0 | 1.3 GiB
1.5B q5_1 | 1.4 GiB
3B q4_0 | 2.1 GiB
3B q5_0 | 2.3 GiB
3B q5_1 | 2.5 GiB

The original model card by BlinkDL follows.

* * *

# RWKV-4 PilePlus

## Model Description

RWKV-4-Pile models fine-tuned on RedPajama plus some of Pile v2 (1.7T tokens in total). Updated with 2020+2021+2022 data, and better at all European languages.

Although some of these are intermediate checkpoints (XXXGtokens means fine-tuned for XXXG tokens), you can already use them, because they are fine-tuned from the Pile models rather than retrained from scratch.

Note: these models are not instruct-tuned yet; they are recommended as replacements for the vanilla Pile models. 7B and 14B are coming soon.

See https://github.com/BlinkDL/RWKV-LM for details. Use https://github.com/BlinkDL/ChatRWKV to run them.
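
For programmatic use, below is a minimal sketch of evaluating one of these GGML files with rwkv.cpp's Python wrapper. The module names (`rwkv_cpp_shared_library`, `rwkv_cpp_model`), the `RWKVModel.eval` call, and the file name used here are assumptions based on the rwkv.cpp repository at the time of writing; check the current rwkv.cpp README before relying on them.

```python
# Minimal sketch (not from the original card): run a few tokens through a
# quantized RWKV GGML file using rwkv.cpp's Python wrapper.
# Assumptions: executed from rwkv.cpp's `rwkv/` directory (or with it on
# PYTHONPATH), the rwkv.cpp shared library has been built, and PyTorch is installed.
import rwkv_cpp_model
import rwkv_cpp_shared_library

# Load librwkv.so / rwkv.dll built from the rwkv.cpp sources.
library = rwkv_cpp_shared_library.load_rwkv_shared_library()

# Hypothetical file name; substitute the actual .bin file downloaded from this repository.
model = rwkv_cpp_model.RWKVModel(library, 'rwkv-4-pileplus-169m-q5_1.bin')

logits, state = None, None

# RWKV is recurrent: tokens are fed one at a time and `state` carries context forward.
for token in [100, 200, 300]:  # placeholder token IDs from the model's tokenizer
    logits, state = model.eval(token, state)

print(logits.shape)  # logits over the vocabulary for the last token
```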