BlinkDL committed on
Commit caeb851
1 Parent(s): 09f4c63

Update README.md

Files changed (1)
  1. README.md +14 -0
README.md CHANGED
@@ -35,4 +35,18 @@ Final checkpoint: RWKV-4-Pile-3B-20221008-8023.pth : Trained on the Pile for 331
  * Hellaswag acc_norm 59.63%
  * ctx_len = 1024 n_layer = 32 n_embd = 2560
 
+ ### Instruct-test models: only useful if you construct your prompt following dataset templates
+
+ RWKV-4-Pile-3B-Instruct-test1
+ instruct-tuned on https://huggingface.co/datasets/bigscience/xP3all/viewer/en/train
+
+ RWKV-4-Pile-3B-Instruct-test2
+ instruct-tuned on https://huggingface.co/datasets/Muennighoff/flan & NIv2
+
+ ### Chinese models
+
+ RWKV-4-Pile-3B-EngChn-testNovel-xxx for writing Chinese novels (trained on 200G Chinese novels.)
+
+ RWKV-4-Pile-3B-EngChn-testxxx for Chinese Q&A (trained on 10G Chinese text. only for testing purposes.)
+
  ## Note: 4 / 4a / 4b models ARE NOT compatible. Use RWKV-4 unless you know what you are doing.