Update README.md
README.md
CHANGED
@@ -24,7 +24,7 @@ ctx_len = 1024
 n_layer = 24
 n_embd = 2048
 
-New checkpoint: RWKV-4-Pile-1B5-20220929-ctx4096.pth : Fine-tuned to ctx_len = 4096
+New checkpoint: RWKV-4-Pile-1B5-20220929-ctx4096.pth : Fine-tuned to ctx_len = 4096. Use it only when your ctxlen is long. Might be slightly weaker for short ctxlens.
 
 Final checkpoint: RWKV-4-Pile-1B5-20220903-8040.pth : Trained on the Pile for 332B tokens.
 * Pile loss 2.0415
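
The guidance in the changed line (use the ctx4096 checkpoint only for long contexts) can be sketched as a small selection helper. This is only an illustration: the README does not say where "long" begins, so the 1024-token cutoff is an assumption taken from the `ctx_len = 1024` shown in the hunk header, and `pick_checkpoint` is a hypothetical name, not part of the RWKV code.

```python
# Checkpoint file names as listed in the README.
BASE_CKPT = "RWKV-4-Pile-1B5-20220903-8040.pth"
LONG_CTX_CKPT = "RWKV-4-Pile-1B5-20220929-ctx4096.pth"

# Assumption: treat anything above the ctx_len = 1024 shown in the
# hunk header as "long". The README does not define a cutoff.
ASSUMED_BASE_CTX_LEN = 1024


def pick_checkpoint(ctx_len: int, threshold: int = ASSUMED_BASE_CTX_LEN) -> str:
    """Return the checkpoint file name suggested for a given context length.

    Hypothetical helper: prefers the final Pile checkpoint for short
    contexts and the ctx4096 fine-tune only above the threshold.
    """
    if ctx_len <= 0:
        raise ValueError("ctx_len must be a positive number of tokens")
    return LONG_CTX_CKPT if ctx_len > threshold else BASE_CKPT


if __name__ == "__main__":
    for n in (512, 1024, 4096):
        print(n, pick_checkpoint(n))
```

The threshold is a parameter so it can be adjusted once you have measured where the ctx4096 fine-tune starts to help on your own data.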