---
license: apache-2.0
datasets:
- Norquinal/claude_multiround_chat_30k
- OpenLeecher/Teatime
---
We proudly announce the world's first 128k-context model based on the RWKV architecture, released 2023-08-10.
This model was trained on instruction datasets together with Chinese web novels and traditional wuxia fiction; more training details will be published later.
As a test, we fed a 67k-token input for summarization; the conversation files can be found in the example folders, and more cases are coming.
The 128k-context model was fully fine-tuned using this repo, taking 40 hours on 4×A800 GPUs over 1.3B tokens: https://github.com/SynthiaDL/TrainChatGalRWKV/blob/main/train_world.sh
Use RWKV Runner (https://github.com/josStorer/RWKV-Runner) to test the model. A temperature of 0.1-0.2 with top-p 0.7 gives more precise answers; temperatures between 1 and 2.x are more creative.
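As a minimal sketch of applying these sampling presets programmatically, assuming RWKV Runner is serving its OpenAI-compatible API locally (the endpoint URL, port, and the creative-preset values other than temperature are assumptions, not part of the original notes):

```python
import json
import urllib.request

# Sampling presets from the notes above.
PRECISE = {"temperature": 0.2, "top_p": 0.7}   # summarization / factual answers
CREATIVE = {"temperature": 1.2, "top_p": 0.7}  # top_p here is an assumption

def build_payload(prompt, preset=PRECISE):
    """Build an OpenAI-style chat-completion request body."""
    return {
        "messages": [{"role": "user", "content": prompt}],
        **preset,
    }

def chat(prompt, preset=PRECISE, url="http://127.0.0.1:8000/v1/chat/completions"):
    """Send a single-turn request to a locally running RWKV Runner instance."""
    req = urllib.request.Request(
        url,
        data=json.dumps(build_payload(prompt, preset)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

For a long-context summarization test like the 67k-token example above, the `PRECISE` preset is the natural choice; switch to `CREATIVE` for story continuation.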