CPUFlow v9 — RAM-Net Sparse Memory

CPUFlow v5-LN backbone with RAM-Net Product Softmax sparse memory and contrastive entity routing. It reaches a low validation perplexity (PPL 9.67), but entity addresses collapsed and generation is incoherent.

Results

Metric	Value
Val PPL	9.67
Parameters	2.57M
Training speed	3,600 tok/s
Training time	2 hours
Hardware	4 vCPU (Lightning AI free tier)
Coherent?	No

Architecture

Same as v9.7, but with an added contrastive entity routing loss intended to make different entities route to different memory slots. The routing collapsed: all entities ended up addressing the same ~3 slots despite the push-apart loss.
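To make the push-apart idea concrete, here is a minimal forward-only sketch in plain Python. It assumes the penalty is the mean pairwise cosine similarity between per-entity routing distributions over memory slots; the loss actually used in train_cpuflow_v9_ram.py may be defined differently. The name `push_apart_penalty` and the example distributions are hypothetical and are not taken from the training code.

```python
import math
from itertools import combinations


def cosine_similarity(p, q):
    """Cosine similarity between two routing distributions (lists of floats)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    return dot / (norm_p * norm_q)


def push_apart_penalty(routing):
    """Mean pairwise cosine similarity across entities (assumed form of the loss).

    `routing` holds one distribution over memory slots per entity.
    Lower values mean the entities address more distinct slots.
    """
    pairs = list(combinations(routing, 2))
    return sum(cosine_similarity(p, q) for p, q in pairs) / len(pairs)


# Hypothetical routing distributions for 3 entities over 4 memory slots.
separated = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
collapsed = [[0.5, 0.5, 0.0, 0.0], [0.5, 0.5, 0.0, 0.0], [0.5, 0.5, 0.0, 0.0]]

separated_penalty = push_apart_penalty(separated)  # entities on distinct slots
collapsed_penalty = push_apart_penalty(collapsed)  # all entities share slots
```

Under this assumed definition the penalty is near its maximum when every entity addresses the same slots and drops to zero when the slots are disjoint. A training version would compute the same quantity on tensors so that gradients can flow into the router.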

Why incoherent?

The contrastive routing loss conflicted with the cross-entropy (CE) training signal. CE pushes toward concentrating memory writes (better prediction), while the contrastive loss pushes toward spreading them apart (entity specificity). CE won: entity addresses collapsed, and the resulting model is similar to v8, with low PPL but incoherent output.
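One way to quantify this kind of collapse is the effective number of memory slots in use, computed as the exponential of the entropy of the average slot usage. The source does not say how the collapse was measured, so the sketch below is an assumed diagnostic. `effective_slots` and the example routing tables are hypothetical.

```python
import math


def effective_slots(routing):
    """Effective number of slots in use: exp(entropy of mean slot usage).

    `routing` holds one distribution over memory slots per entity.
    """
    num_slots = len(routing[0])
    # Average how much probability mass each slot receives across entities.
    usage = [sum(row[s] for row in routing) / len(routing) for s in range(num_slots)]
    # Shannon entropy in nats; unused slots contribute nothing.
    entropy = -sum(u * math.log(u) for u in usage if u > 0.0)
    return math.exp(entropy)


third = 1.0 / 3.0
# Hypothetical collapsed router: 4 entities, 8 slots, all mass on the same 3 slots.
collapsed = [[third, third, third, 0.0, 0.0, 0.0, 0.0, 0.0] for _ in range(4)]
# Hypothetical spread-out router: each entity uses its own pair of slots.
spread = [
    [0.5, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.5, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.5, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.5, 0.5],
]

collapsed_count = effective_slots(collapsed)
spread_count = effective_slots(spread)
```

A value close to 3 for a memory with many more slots would match the collapse described above, while a value close to the total slot count would indicate that entities are spread across the memory.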

v9.7 removed the contrastive loss entirely and produced more coherent output despite a slightly higher perplexity (PPL 10.23, semi-coherent).

Usage

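import torch
from tokenizers import Tokenizer

# Load the tokenizer shipped with the model files
tokenizer = Tokenizer.from_file("tokenizer.json")
# Load the best checkpoint onto the CPU
checkpoint = torch.load("best.pt", map_location="cpu")
# Build the model before using the checkpoint (see train_cpuflow_v9_ram.py for the full architecture)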

See GitHub for full training code.

Citation

@misc{Chang,
  title        = {FlashLM: CPU-Native Language Models Trained From Scratch on Free-Tier Hardware},
  author       = {Chang, Cheng},
  year         = {2026},
  publisher    = {Zenodo},
  doi          = {10.5281/zenodo.20113960}
}

MIT License.
