BlinkDL committed on
Commit
2739b27
1 Parent(s): 6079fbd

Update README.md

  1. README.md +3 -2
README.md CHANGED
@@ -24,9 +24,10 @@ ctx_len = 1024
  n_layer = 32
  n_embd = 4096
 
- (There are ctx_len 2048 and 4096 models too. Use them only when your ctxlen is long. Might be slightly weaker for short ctxlens.)
+ RWKV-4-Pile-7B-20230109-ctx4096.pth
+ * Likely better. Please test.
 
- Final checkpoint: RWKV-4-Pile-7B-20221115-8047.pth : Trained on the Pile for 332B tokens.
+ RWKV-4-Pile-7B-20221115-8047.pth : Trained on the Pile for 332B tokens.
  * Pile loss 1.8415T
  * LAMBADA ppl 4.38, acc 67.18%
  * PIQA acc 76.06%
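
The README reports the Pile result as a loss and the LAMBADA result as a perplexity, so the two numbers are not directly comparable as written. A minimal sketch of the conversion follows. It assumes the loss is a mean cross-entropy in nats per token, and it treats the Pile figure as 1.8415, ignoring the trailing "T", whose meaning is not stated in this diff. The function names are illustrative and do not come from the RWKV code.

```python
import math


def loss_to_perplexity(loss_nats: float) -> float:
    """Convert a mean cross-entropy loss (nats per token) to perplexity."""
    return math.exp(loss_nats)


def perplexity_to_loss(ppl: float) -> float:
    """Convert a perplexity back to a mean cross-entropy loss in nats."""
    return math.log(ppl)


# Values copied from the README lines in the diff above.
# Assumption: "1.8415T" is read as a loss of 1.8415 nats per token.
pile_loss = 1.8415
lambada_ppl = 4.38

pile_ppl = loss_to_perplexity(pile_loss)
lambada_loss = perplexity_to_loss(lambada_ppl)

print(f"Pile perplexity (assuming nats): {pile_ppl:.2f}")
print(f"LAMBADA loss (assuming nats):    {lambada_loss:.3f}")
```

The two benchmarks use different data and different evaluation targets, so the converted numbers only put the units on the same footing. They do not make the scores comparable across datasets.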