xiaol/rwkv-7B-world-novel-128k

	---
	license: apache-2.0
	datasets:
	- Norquinal/claude_multiround_chat_30k
	- OpenLeecher/Teatime
	---
	We proudly announce what is, as of today (2023-08-10), the world's first 128k-context model based on the RWKV architecture.

	This model was trained on instruction datasets, Chinese web novels, and traditional wuxia fiction.
	More training details will be added later.

	The model was fully fine-tuned for 128k context on 4× A800 GPUs for 40 hours over 1.3B tokens, using the following script:
	https://github.com/SynthiaDL/TrainChatGalRWKV/blob/main/train_world.sh
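For a rough sense of scale, the training figures above can be turned into throughput numbers. This is only back-of-the-envelope arithmetic on the stated values; it assumes "128k" means 128 × 1024 tokens and that all four GPUs were busy for the full 40 hours, neither of which is confirmed here.

```python
# Figures quoted in this README.
TOKENS = 1.3e9   # total training tokens
GPUS = 4         # number of A800 GPUs
HOURS = 40       # wall-clock training time

# Assumption: "128k" context means 128 * 1024 = 131072 tokens.
CONTEXT_LEN = 128 * 1024

gpu_hours = GPUS * HOURS                        # total GPU-hours spent
tokens_per_gpu_hour = TOKENS / gpu_hours        # average tokens per GPU-hour
tokens_per_gpu_second = tokens_per_gpu_hour / 3600
full_context_sequences = TOKENS / CONTEXT_LEN   # data size in 128k-token windows

print(f"GPU-hours:              {gpu_hours}")
print(f"Tokens per GPU-hour:    {tokens_per_gpu_hour:,.0f}")
print(f"Tokens per GPU-second:  {tokens_per_gpu_second:,.0f}")
print(f"Full 128k sequences:    {full_context_sequences:,.0f}")
```

Under these assumptions the run works out to 160 GPU-hours and roughly 8.1M tokens per GPU-hour, with the data amounting to a little under 10,000 full-length 128k windows.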

	![QQ图片20230810153529.jpg](https://cdn-uploads.huggingface.co/production/uploads/6176b32847ee6431f632981e/d8ekmc4Lfhy2lYEdrRKXz.jpeg)

	To test the model, use RWKV Runner (https://github.com/josStorer/RWKV-Runner). Use a temperature of 0.1-0.2 with top_p 0.7 for more precise answers; a temperature between 1 and 2.x gives more creative output.
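To illustrate why these settings behave so differently, here is a minimal sketch of temperature plus top-p (nucleus) sampling in plain Python. It is a generic illustration, not RWKV Runner's actual code; the function name `sample_top_p` and the toy logits are hypothetical.

```python
import math
import random

def sample_top_p(logits, temperature=1.0, top_p=0.7, rng=None):
    """Sample a token index using temperature scaling and top-p filtering."""
    rng = rng or random.Random()
    # Divide logits by temperature: values below 1 sharpen the distribution,
    # values above 1 flatten it.
    scaled = [x / temperature for x in logits]
    # Numerically stable softmax.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Keep the most likely tokens until their cumulative probability
    # reaches top_p; everything else is discarded.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    # Sample among the kept tokens in proportion to their probabilities.
    weights = [probs[i] for i in kept]
    return rng.choices(kept, weights=weights, k=1)[0]

# Toy logits for a 4-token vocabulary (hypothetical values).
toy_logits = [2.0, 1.0, 0.5, 0.1]
rng = random.Random(0)
precise = [sample_top_p(toy_logits, temperature=0.1, top_p=0.7, rng=rng) for _ in range(200)]
creative = [sample_top_p(toy_logits, temperature=2.0, top_p=0.7, rng=rng) for _ in range(200)]
print("distinct tokens at temp 0.1:", sorted(set(precise)))
print("distinct tokens at temp 2.0:", sorted(set(creative)))
```

With a low temperature the top token dominates so strongly that top-p keeps only that one candidate, which makes answers nearly deterministic. With a high temperature the distribution flattens, more tokens survive the top-p cut, and output becomes more varied.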

	![image.png](https://cdn-uploads.huggingface.co/production/uploads/6176b32847ee6431f632981e/5zDQVbGb-fX8Y8h98tUF0.png)

	![微信截图_20230810142220.png](https://cdn-uploads.huggingface.co/production/uploads/6176b32847ee6431f632981e/u2wA-l1UcW-Mt9KIoa_4q.png)

	![4UYBX4RA0%8PA{1YSSK)AVW.png](https://cdn-uploads.huggingface.co/production/uploads/6176b32847ee6431f632981e/gzr8Yt4JRkBz31-ieRSOE.png)

	![QQ图片20230810143840.png](https://cdn-uploads.huggingface.co/production/uploads/6176b32847ee6431f632981e/LgEjfHJ7XD7PlGM9b3RAf.png)

	![image.png](https://cdn-uploads.huggingface.co/production/uploads/6176b32847ee6431f632981e/b_6KCBdZKW7Q7HwipxE-l.png)