	---
	language:
	- en
	- zh
	- fr
	- es
	- de
	- pt
	- ru
	- it
	- ja
	- ko
	- vi
	- ar
	tags:
	- pytorch
	- text-generation
	- causal-lm
	- rwkv
	license: apache-2.0
	datasets:
	- cerebras/SlimPajama-627B
	- EleutherAI/pile
	---

	# RWKV-5 World (Training in Progress)

	## I am now uploading the latest training-in-progress checkpoints to https://huggingface.co/BlinkDL/temp/tree/main (to avoid bloating the git history)

	Use the rwkv pip package, version 0.8.14 or later, for RWKV-5 inference: https://pypi.org/project/rwkv/
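As a rough illustration of the version requirement above, here is a minimal sketch that checks the installed rwkv package version and then runs generation through the package's `RWKV` and `PIPELINE` classes. The helper names (`parse_version`, `rwkv_is_recent_enough`, `generate`), the checkpoint path, the `cpu fp32` strategy, and the sampling settings are illustrative assumptions, not recommendations from this model card.

```python
import os
import re
from importlib import metadata

MIN_RWKV_VERSION = (0, 8, 14)  # minimum version stated in this README


def parse_version(text):
    """Turn a version string such as '0.8.14' into a tuple (0, 8, 14)."""
    parts = []
    for piece in text.split(".")[:3]:
        match = re.match(r"\d+", piece)  # keep only the leading digits
        parts.append(int(match.group()) if match else 0)
    return tuple(parts)


def rwkv_is_recent_enough(installed):
    """Return True if the given version string is at least 0.8.14."""
    return parse_version(installed) >= MIN_RWKV_VERSION


def generate(model_path, prompt, token_count=100):
    """Generate text with a locally downloaded RWKV-5 World checkpoint.

    model_path is a hypothetical local path; the strategy and sampling
    values below are assumptions chosen for a CPU-only sketch.
    """
    installed = metadata.version("rwkv")
    if not rwkv_is_recent_enough(installed):
        raise RuntimeError(f"rwkv {installed} is older than 0.8.14")

    # These flags are read when rwkv.model is imported, so set them first.
    os.environ.setdefault("RWKV_JIT_ON", "1")
    os.environ.setdefault("RWKV_CUDA_ON", "0")

    # Imported lazily so the version helpers work without rwkv installed.
    from rwkv.model import RWKV
    from rwkv.utils import PIPELINE, PIPELINE_ARGS

    model = RWKV(model=model_path, strategy="cpu fp32")
    pipeline = PIPELINE(model, "rwkv_vocab_v20230424")  # World tokenizer
    args = PIPELINE_ARGS(temperature=1.0, top_p=0.7)
    return pipeline.generate(prompt, token_count=token_count, args=args)


if __name__ == "__main__":
    # Hypothetical checkpoint path; replace with a file you have downloaded.
    print(generate("/path/to/rwkv-5-world-checkpoint.pth", "Hello, world!"))
```

Whether the checkpoint path should include the `.pth` suffix may depend on the package version, so check the rwkv package documentation if loading fails.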

	Online Demo: https://huggingface.co/spaces/BlinkDL/ChatRWKV-gradio

	GUI: https://github.com/josStorer/RWKV-Runner (see Releases)

	How it works: https://twitter.com/BlinkDL_AI/status/1685230712247795713

	https://www.rwkv.com/

	## Model Description

	RWKV-5 is trained on 100+ world languages (70% English, 15% multilingual, 15% code).

	World = Some_Pile + Some_SlimPajama + Some_StarCoder + Some_OSCAR + All_Wikipedia + All_ChatGPT_Data_I_can_find

	RWKV-5 training: set --my_testing "r2r4" in the latest RWKV-LM v4neo: https://github.com/BlinkDL/RWKV-LM

	World v1 = 0.59T tokens

	World v2 = 1.12T tokens

	Imagine what happens when we use more data :)
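To make the numbers above concrete, the sketch below works out a rough per-category token count. It assumes the 70% / 15% / 15% split applies to the total token count of each World version, which this card does not state explicitly, so treat the results as estimates. The names `MIX_PERCENT`, `WORLD_TOKENS_B`, and `split_tokens` are hypothetical.

```python
# Data mix from the model description, in percent.
MIX_PERCENT = {"english": 70, "multilingual": 15, "code": 15}

# Dataset sizes from the model description, in billions of tokens.
WORLD_TOKENS_B = {"v1": 590, "v2": 1120}


def split_tokens(total_billion):
    """Estimate tokens per category, assuming the mix applies to the total."""
    return {name: total_billion * pct / 100 for name, pct in MIX_PERCENT.items()}


for version, total in WORLD_TOKENS_B.items():
    print(version, split_tokens(total))

# Growth factor from World v1 to World v2.
growth = WORLD_TOKENS_B["v2"] / WORLD_TOKENS_B["v1"]
print(f"World v2 is about {growth:.2f}x the size of World v1")
```

Under that assumption, World v2 would contain roughly 784B English tokens and about 168B tokens each of multilingual text and code.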