rwkv-4-pileplus / README.md
BlinkDL's picture
Update README.md
e637e53
metadata
language:
  - en
tags:
  - pytorch
  - text-generation
  - causal-lm
  - rwkv
license: apache-2.0
datasets:
  - EleutherAI/pile
  - togethercomputer/RedPajama-Data-1T

RWKV-4 PilePlus

Model Description

RWKV-4-pile models finetuning on [RedPajama + some of Pile v2 = 1.7T tokens]. Updated with 2020+2021+2022 data, and better at all European languages.

Although some of these are intermedia checkpoints (XXXGtokens means finetuned for XXXG tokens), you can already use them because I am finetuning from Pile models (instead of retraining).

Note: not instruct tuned yet, and recommended to replace vanilla Pile models.

7B and 14B coming soon.

See https://github.com/BlinkDL/RWKV-LM for details.

Use https://github.com/BlinkDL/ChatRWKV to run it.