BlinkDL/rwkv-4-pile-14b

Text Generation

BlinkDL committed on Mar 5, 2023

Commit a91be67 • 1 Parent(s): f720a95

Update README.md

Files changed (1)

README.md +2 -7

@@ -20,16 +20,11 @@ RWKV-4 14B is a L40-D5120 causal language model trained on the Pile. See https:/
 Use https://github.com/BlinkDL/ChatRWKV to run it.
-n_layer = 40
-n_embd = 5120
 RWKV-4-Pile-14B-2023xxxx-ctx4096-testxxx.pth : Fine-tuned to ctx_len 4096.
-* ctx_len = 4096
 * Highly recommended. It's great.
-RWKV-4-Pile-14B-20230213-8019.pth : Trained on the Pile for 331B tokens.
-* ctx_len = 1024
-* Pile loss 1.7579
+RWKV-4-Pile-14B-20230213-8019.pth : Trained on the Pile for 331B tokens
+* Pile loss 1.7579 (ctx_len 1024)
 * LAMBADA ppl 3.81, acc 71.05%
 * PIQA acc 77.42%
 * SC2016 acc 75.57%
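
The README reports the Pile result as a loss (1.7579) and the LAMBADA result as a perplexity (3.81). A minimal sketch of how the two units relate, assuming the loss is a mean per-token cross-entropy in nats (the README does not state the unit); the function names are hypothetical and not part of ChatRWKV:

```python
import math


def loss_to_perplexity(loss_nats: float) -> float:
    """Convert a mean per-token cross-entropy (assumed to be in nats) to perplexity."""
    return math.exp(loss_nats)


def perplexity_to_loss(ppl: float) -> float:
    """Inverse conversion: perplexity back to mean per-token cross-entropy in nats."""
    return math.log(ppl)


# Figures copied from the README diff above.
pile_loss = 1.7579   # Pile loss at ctx_len 1024
lambada_ppl = 3.81   # LAMBADA perplexity

pile_ppl = loss_to_perplexity(pile_loss)
lambada_loss = perplexity_to_loss(lambada_ppl)

print(f"Pile perplexity (assuming nats): {pile_ppl:.2f}")
print(f"LAMBADA loss (assuming nats):    {lambada_loss:.4f}")
```

The two numbers come from different datasets and evaluation setups, so converting them to a common unit makes them easier to read but not directly comparable.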