Original model: xiaol/RWKV-v5-12B-one-state-chat-16k
You can run this model with ai00_rwkv_server.
Although ai00_rwkv_server is mainly intended for low-end PCs, you can also run it on servers that support Vulkan.
To try it in Colab, first install a libnvidia-gl-* package:
!apt -y install libnvidia-gl-535
Original model card:
Release date: December 18th
Fine-tuned from the state-of-the-art (SOTA) RWKV v5 12B one state base model. More details will be provided soon. This model is optimized for systems with 24GB of VRAM and supports fp16. It can be fine-tuned on a single A100 GPU. To run it, use the RWKV Runner tool.
Finetuned from Mobius 12B base
Usage
- RWKV next web
- If used with RWKV Runner or ai00 server, replace the default vocab (tokenizer) with this one
Important Notes
Because this fine-tune overfits certain instructions and weakens others, you need to use completion prompts or simulated dialogues.
- completion prompt = 'User: make this content longer:\nxxxxxx\n\nAssistant: ok, longer content is'
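A completion prompt like the one above can be assembled with a small helper. This is only a sketch: the function name build_completion_prompt and its parameters are hypothetical, and it assumes that the \n sequences in the example stand for real newline characters.

```python
def build_completion_prompt(instruction: str, content: str, assistant_prefix: str) -> str:
    """Build a completion-style prompt (hypothetical helper, not part of any RWKV tool).

    The assistant turn is left open with a short prefix so the model
    continues the answer instead of following a fresh instruction.
    """
    # Assumption: "\n" in the model card example means an actual newline.
    return f"User: {instruction}\n{content}\n\nAssistant: {assistant_prefix}"


prompt = build_completion_prompt(
    "make this content longer:",
    "xxxxxx",
    "ok, longer content is",
)
```

The resulting string is sent to the server as a plain completion request, and the model continues from the assistant prefix.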
Data format
<s>User:xxxx\n\n</s>Assistant:xxx\n\n</s>User:xxxx\n\n</s>Assistant:xxx\n\n</s>
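As an illustration, the format above can be produced from a list of turns with a few lines of Python. The helper name format_dialogue is hypothetical, and the sketch assumes that \n denotes a real newline and that <s> and </s> are written as literal text exactly as shown.

```python
from typing import List, Tuple


def format_dialogue(turns: List[Tuple[str, str]]) -> str:
    """Serialize (role, text) turns into the data format shown above.

    Hypothetical helper: one leading "<s>", then each turn written as
    "Role:text", followed by two newlines and a closing "</s>".
    """
    # Assumption: "\n" in the format line means an actual newline character.
    body = "".join(f"{role}:{text}\n\n</s>" for role, text in turns)
    return "<s>" + body


sample = format_dialogue([
    ("User", "xxxx"),
    ("Assistant", "xxx"),
])
```

Passing more (role, text) pairs extends the string with further User and Assistant turns in the same pattern.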
For optimal performance when running this model, use this format and these vocabs.