Tilelli-llm / CITATION.cff

cff-version: 1.2.0
title: "Tilelli — a small routed byte-LM with verifiable claims"
message: "If you use this kit, please cite it as below."
version: "0.1.0"
date-released: "2026-05-24"
authors:
  - name: "Tilelli LLM Team"
license: Apache-2.0
repository-code: "https://github.com/TilelliLab/Tilelli-llm"
abstract: >
  A 10 M-parameter byte-level language model with a 3-pathway heterogeneous
  block. Trained on a single GPU, runs on a laptop CPU. Every numerical
  claim in the README is bound to a reproduce script that exits non-zero
  if the bundled checkpoint fails to produce the documented number.
  Ships verified positive results (held-out IDK gate, NEO false-inability
  rate) alongside verified negative results (router-entropy is not free
  metacognition at this scale; abstain heads do not transfer modularly;
  the router cannot be retrained on subset distributions without breaking
  generation).
keywords:
  - small language model
  - mixture of experts
  - routing
  - calibration
  - negative results
  - reproducibility
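
The abstract says each numerical claim is bound to a reproduce script that exits non-zero when the bundled checkpoint does not produce the documented number. A minimal sketch of that pattern follows. It is not taken from the repository: the function name `run_checks`, the absolute tolerance, and the metric names are assumptions made for illustration, and the kit's actual scripts may be structured differently.

```python
import math
import sys


def run_checks(measured: dict[str, float],
               documented: dict[str, float],
               tol: float = 1e-3) -> int:
    """Return 0 if every documented number is reproduced within tol, else 1.

    Hypothetical helper: `measured` would come from evaluating the bundled
    checkpoint, `documented` from the numbers stated in the README.
    """
    failures = 0
    for name, expected in documented.items():
        got = measured.get(name)
        # A missing metric counts as a failure, not as a silent skip.
        ok = got is not None and math.isclose(
            got, expected, rel_tol=0.0, abs_tol=tol
        )
        status = "PASS" if ok else "FAIL"
        print(f"[{status}] {name}: expected={expected} got={got}",
              file=sys.stderr)
        if not ok:
            failures += 1
    return 0 if failures == 0 else 1


# A reproduce script built on this helper would end with something like:
#     sys.exit(run_checks(measured, documented))
# so that CI or a shell pipeline sees a non-zero status on any mismatch.
```

Returning the exit code from a function, instead of calling `sys.exit` inside it, keeps the check testable on its own. The script's entry point alone decides when to terminate the process.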