Update README.md
README.md
CHANGED
@@ -1,3 +1,37 @@
 ---
+language:
+- en
+tags:
+- ggml
+- text-generation
+- causal-lm
+- rwkv
 license: apache-2.0
+datasets:
+- EleutherAI/pile
+- togethercomputer/RedPajama-Data-1T
 ---
+
+**Last updated:** 2023-05-23
+
+This is [BlinkDL/rwkv-4-pileplus](https://huggingface.co/BlinkDL/rwkv-4-pileplus) converted to GGML for use with rwkv.cpp and KoboldCpp. [rwkv.cpp's conversion instructions](https://github.com/saharNooby/rwkv.cpp#option-32-convert-and-quantize-pytorch-model) were followed.
+
+The original model card is below.
+
+* * *
+
+# RWKV-4 PilePlus
+
+## Model Description
+
+RWKV-4-pile models finetuned on [RedPajama + some of Pile v2 = 1.7T tokens]. Updated with 2020+2021+2022 data, and better at all European languages.
+
+Although some of these are intermediate checkpoints (XXXGtokens means finetuned for XXXG tokens), you can already use them because I am finetuning from Pile models (instead of retraining).
+
+Note: these models are not instruction-tuned yet, and they are recommended as replacements for the vanilla Pile models.
+
+7B and 14B coming soon.
+
+See https://github.com/BlinkDL/RWKV-LM for details.
+
+Use https://github.com/BlinkDL/ChatRWKV to run it.