Edit model card

RWKV Minipile

Model Specifications

  • Architecture: RWKV
  • Vocabulary Size: 65,536
  • Embedding Size: 768
  • Number of Layers: 12
  • Context Length: 512
  • Data Type: bfloat16
  • Dataset: Minipile
  • Tokens: 20,643,840 (20 Million)

The model underwent a rigorous training regimen, completing 32 epochs to optimize performance.

Wandb

Training progress

Verifying checksums

sha512sum -c sha512sums.txt

Inference

pip install torch numpy

python inference.py
Downloads last month

-

Downloads are not tracked for this model. How to track
Unable to determine this model's library. Check the docs .

Dataset used to train leliuga/rwkv-minipile