---
language:
  - en
tags:
  - ggml
  - text-generation
  - causal-lm
  - rwkv
license: apache-2.0
datasets:
  - EleutherAI/pile
  - togethercomputer/RedPajama-Data-1T
---

Last updated: 2023-05-23

This is BlinkDL/rwkv-4-pileplus converted to GGML for use with rwkv.cpp and KoboldCpp, following rwkv.cpp's conversion instructions.
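For reference, the conversion flow looks roughly like the sketch below. This is a hedged example, not the exact commands used for this upload: the script names and arguments reflect the rwkv.cpp repository around the time of this release and may have changed, and the file paths are placeholders.

```shell
# Sketch of the rwkv.cpp conversion flow (paths are illustrative placeholders).
# 1. Convert the original PyTorch checkpoint to a GGML file in FP16:
python rwkv.cpp/rwkv/convert_pytorch_to_ggml.py \
    RWKV-4-PilePlus-169M.pth rwkv-4-pileplus-169m-f16.bin FP16

# 2. Quantize the FP16 GGML file to a smaller format (e.g. q5_1):
python rwkv.cpp/rwkv/quantize.py \
    rwkv-4-pileplus-169m-f16.bin rwkv-4-pileplus-169m-q5_1.bin Q5_1
```

The quantized file can then be loaded directly by rwkv.cpp or KoboldCpp; lower-bit formats trade a small amount of quality for the reduced RAM usage shown in the table below.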

## RAM usage (KoboldCpp)

| Model      | Quantization | RAM usage (with OpenBLAS) |
|------------|--------------|---------------------------|
| (unloaded) | n/a          | 41.3 MiB                  |
| 169M       | q4_0         | 249.0 MiB                 |
| 169M       | q5_0         | 254.2 MiB                 |
| 169M       | q5_1         | 259.6 MiB                 |
| 430M       | q4_0         | 443.7 MiB                 |
| 430M       | q5_0         | 463.2 MiB                 |
| 430M       | q5_1         | 482.8 MiB                 |
| 1.5B       | q4_0         | 1.2 GiB                   |
| 1.5B       | q5_0         | 1.3 GiB                   |
| 1.5B       | q5_1         | 1.4 GiB                   |
| 3B         | q4_0         | 2.1 GiB                   |
| 3B         | q5_0         | 2.3 GiB                   |
| 3B         | q5_1         | 2.5 GiB                   |

The original model card by BlinkDL follows below.


# RWKV-4 PilePlus

## Model Description

RWKV-4-pile models finetuned on [RedPajama + some of Pile v2 = 1.7T tokens]. Updated with 2020+2021+2022 data, and better at all European languages.

Although some of these are intermediate checkpoints (XXXGtokens means finetuned for XXXG tokens), you can already use them, because they are finetuned from the Pile models (instead of retrained from scratch).

Note: these are not instruct-tuned yet, but they are recommended as replacements for the vanilla Pile models.

7B and 14B coming soon.

See https://github.com/BlinkDL/RWKV-LM for details.

Use https://github.com/BlinkDL/ChatRWKV to run it.