CPUFlow v9: RAM-Net Sparse Memory
CPUFlow v5-LN backbone with RAM-Net Product Softmax sparse memory and contrastive entity routing. It reaches a low validation PPL (9.67), but the entity addresses collapsed and generated text is incoherent.
Results
| Metric | Value |
|---|---|
| Val PPL | 9.67 |
| Parameters | 2.57M |
| Training speed | 3,600 tok/s |
| Training time | 2 hours |
| Hardware | 4 vCPU (Lightning AI free tier) |
| Coherent? | No |
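The headline numbers can be cross-checked with a little arithmetic. The sketch below assumes that the reported perplexity is the exponential of the mean per-token cross-entropy in nats (the usual convention, not confirmed here) and that the 3,600 tok/s throughput was sustained for the full 2 hours. The variable names are illustrative only.

```python
import math

val_ppl = 9.67            # validation perplexity from the table
tokens_per_second = 3600  # reported training speed
training_hours = 2        # reported training time
n_params = 2.57e6         # reported parameter count

# Assumption: PPL = exp(mean cross-entropy in nats per token).
val_ce_nats = math.log(val_ppl)
val_ce_bits = val_ce_nats / math.log(2)

# Assumption: throughput was constant over the whole run.
tokens_seen = tokens_per_second * training_hours * 3600
tokens_per_param = tokens_seen / n_params

print(f"val CE: {val_ce_nats:.3f} nats/token ({val_ce_bits:.3f} bits/token)")
print(f"tokens processed: {tokens_seen / 1e6:.2f}M")
print(f"tokens per parameter: {tokens_per_param:.1f}")
```

Under these assumptions the run corresponds to roughly 2.27 nats per token and about 25.92M training tokens, or around 10 tokens per parameter.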
Architecture
Same as v9.7, but with an added contrastive entity routing loss intended to make different entities route to different memory slots. The routing collapsed: all entities ended up addressing the same ~3 slots despite the push-apart loss.
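One way to quantify this kind of collapse is the effective number of slots in use, computed as the exponential of the entropy of the slot-usage distribution. The sketch below is not taken from the training code. It assumes you can extract one top-1 slot index per entity, and the function name `effective_slots` and the example routing lists are hypothetical.

```python
import math
from collections import Counter

def effective_slots(slot_ids):
    """Return exp(entropy) of the slot-usage distribution.

    Equals the number of distinct slots when usage is uniform and
    approaches 1 when every entity is routed to the same slot.
    """
    counts = Counter(slot_ids)
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return math.exp(entropy)

# Hypothetical routing decisions: one top-1 slot index per entity.
healthy = [0, 5, 9, 14, 21, 30, 42, 57]   # each entity has its own slot
collapsed = [3, 3, 7, 3, 11, 7, 3, 3]     # everything lands in 3 slots

print(effective_slots(healthy))
print(effective_slots(collapsed))
```

A value that stays near 3 while the number of entities grows would match the collapse described above. A value that tracks the number of entities would indicate that routing is actually separating them.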
Why incoherent?
The contrastive routing loss conflicted with the cross-entropy (CE) training signal. CE pushes toward concentrating memory writes (better prediction), while the contrastive loss pushes toward spreading them apart (entity specificity). CE won: entity addresses collapsed, and the resulting model resembles v8, with low PPL but incoherent output.
v9.7 removed the contrastive loss entirely and produced better generations (PPL 10.23, semi-coherent), despite the slightly higher perplexity.
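The tension between the two objectives can be illustrated with a toy combined loss. This is a minimal sketch, not the loss used in training. It assumes the push-apart term is the mean pairwise cosine similarity between per-entity soft addresses, and the names `push_apart_loss`, `total_loss`, and `weight`, as well as the address values, are made up for illustration.

```python
import math

def cosine(u, v):
    # Cosine similarity between two non-zero vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def push_apart_loss(addresses):
    """Mean pairwise cosine similarity between entity address distributions.

    Illustrative stand-in for a contrastive routing term: 1.0 when all
    entities share one address pattern, 0.0 when patterns do not overlap.
    """
    n = len(addresses)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    return sum(cosine(addresses[i], addresses[j]) for i, j in pairs) / len(pairs)

def total_loss(ce_loss, addresses, weight):
    # Hypothetical combined objective: CE plus a weighted routing penalty.
    return ce_loss + weight * push_apart_loss(addresses)

# Each row: one entity's soft address over 4 memory slots (made-up values).
collapsed = [[0.9, 0.1, 0.0, 0.0], [0.9, 0.1, 0.0, 0.0], [0.9, 0.1, 0.0, 0.0]]
spread = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]

print(push_apart_loss(collapsed))
print(push_apart_loss(spread))
```

In this toy setup, collapsing is the better move for the optimizer whenever it lowers the CE term by more than `weight` times the resulting increase in the routing penalty, which is one way to read "CE won".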
Usage
import torch
from tokenizers import Tokenizer
tokenizer = Tokenizer.from_file("tokenizer.json")
checkpoint = torch.load("best.pt", map_location="cpu")
# Build model (see train_cpuflow_v9_ram.py for full architecture)
See GitHub for full training code.
Citation
@misc{Chang,
title = {FlashLM: CPU-Native Language Models Trained From Scratch on Free-Tier Hardware},
author = {Chang, Cheng},
year = {2026},
publisher = {Zenodo},
doi = {10.5281/zenodo.20113960}
}
MIT License.