---
language:
- en
- zh
- fr
- es
- de
- pt
- ru
- it
- ja
- ko
- vi
- ar
tags:
- pytorch
- text-generation
- causal-lm
- rwkv
license: apache-2.0
datasets:
- cerebras/SlimPajama-627B
- EleutherAI/pile
- bigcode/starcoderdata
- oscar-corpus/OSCAR-2301
---

# RWKV-5 World (Training in Progress)

## I am now uploading the latest training-in-progress checkpoints to https://huggingface.co/BlinkDL/temp/tree/main (to avoid bloating the git history)

Use the rwkv pip package v0.8.21+ for RWKV-5 inference: https://pypi.org/project/rwkv/
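
For example, a minimal inference sketch with the rwkv package (the checkpoint filename and strategy below are placeholders; point it at any RWKV-5 World .pth from this repo, without the .pth extension, and pick a strategy that fits your hardware):
```
import os
os.environ["RWKV_JIT_ON"] = "1"   # enable the JIT kernels
os.environ["RWKV_CUDA_ON"] = "0"  # "1" compiles the custom CUDA kernel (faster)

from rwkv.model import RWKV
from rwkv.utils import PIPELINE, PIPELINE_ARGS

# placeholder checkpoint name - use any RWKV-5 World checkpoint you downloaded
model = RWKV(model="RWKV-5-World-1B5-v2-20231025-ctx4096", strategy="cuda fp16")
pipeline = PIPELINE(model, "rwkv_vocab_v20230424")  # vocab for World models

ctx = "User: hi\n\nAssistant:"
args = PIPELINE_ARGS(temperature=1.0, top_p=0.3,
                     alpha_frequency=0.3, alpha_presence=0.3)
print(ctx + pipeline.generate(ctx, token_count=200, args=args))
```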

Online 1.5B Demo: https://huggingface.co/spaces/BlinkDL/RWKV-Gradio-1

Online 3B Demo: https://huggingface.co/spaces/BlinkDL/RWKV-Gradio-2

GUI: https://github.com/josStorer/RWKV-Runner (see Releases)

How it works: https://twitter.com/BlinkDL_AI/status/1685230712247795713

https://www.rwkv.com/

## Model Description

RWKV-5 is trained on 100+ world languages (70% English, 15% multilingual, 15% code).

World = Some_Pile + Some_SlimPajama + Some_StarCoder + Some_OSCAR + All_Wikipedia + All_ChatGPT_Data_I_can_find

RWKV-5 training: set --my_testing "r2r4" in the latest RWKV-LM v4neo: https://github.com/BlinkDL/RWKV-LM

World v1 = 0.59T tokens

World v2 = 1.12T tokens

Imagine what happens when we use more data :)

Recommended fine-tuning format (use \n for newlines):
```
User: xxxxxxxxxxxxxxx

Assistant: xxxxxxxxxxxxxxx
xxxxxxxxxxxxxxx
xxxxxxxxxxxxxxx

User: xxxxxxxxxxxxxxx
xxxxxxxxxxxxxxx

Assistant: xxxxxxxxxxxxxxx
xxxxxxxxxxxxxxx
xxxxxxxxxxxxxxx
xxxxxxxxxxxxxxx
```
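
A hypothetical helper (not part of any RWKV library) that assembles this format, collapsing blank lines inside each message so that \n\n only ever appears as the separator between turns:
```
def build_prompt(turns):
    # turns: list of (role, text) pairs, e.g. ("User", "hi")
    parts = []
    for role, text in turns:
        text = text.strip()
        while "\n\n" in text:            # no blank lines inside a message
            text = text.replace("\n\n", "\n")
        parts.append(f"{role}: {text}")
    return "\n\n".join(parts)

turns = [
    ("User", "hi"),
    ("Assistant", "Hi. I am your assistant. Feel free to ask any question."),
    ("User", "Summarize RWKV in one paragraph."),
]
prompt = build_prompt(turns) + "\n\nAssistant:"  # trailing cue for generation
```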

A good chat prompt (replace any \n\n inside xxx with \n, so that there are no blank lines within xxx):
```
User: hi

Assistant: Hi. I am your assistant and I will provide expert full response in full details. Please feel free to ask any question and I will always answer it.

User: xxx

Assistant:
```
QA prompt (again, replace any \n\n inside xxx with \n, so that there are no blank lines within xxx):
```
Question: xxx

Answer:
```
and
```
Instruction: xxx

Input: xxx

Response:
```
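
The model ends a completed answer with \n\n, so a decoding loop can stop there. A minimal sketch reusing the model and pipeline objects from the inference example above (the sampling values are illustrative):
```
prompt = "Question: How many days are in a leap year?\n\nAnswer:"

out, state = model.forward(pipeline.encode(prompt), None)  # prefill the prompt

generated, answer = [], ""
for _ in range(200):                     # token budget
    token = pipeline.sample_logits(out, temperature=1.0, top_p=0.3)
    generated.append(token)
    answer = pipeline.decode(generated)
    if "\n\n" in answer:                 # end-of-answer marker
        break
    out, state = model.forward([token], state)

print(answer.split("\n\n")[0].strip())
```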