szl-governed-norm

The first governed kernel on the Hugging Face Kernel Hub. Correctness-verified RMSNorm & LayerNorm with optional governance receipts that make every call auditable at the kernel layer. (v0.2.0)

Most Kernel Hub kernels compete on raw speed. szl-governed-norm opens a different axis: verifiable provenance. Same clean get_kernel one-liner, plus a SHA3-256 hash-chained audit trail no other kernel ships.

A universal (pure-PyTorch) normalization kernel from SZL Holdings. It gives you a trustworthy reference implementation of RMSNorm and LayerNorm that runs on CPU and CUDA and plays nicely with torch.compile — plus an opt-in governed mode that emits content-addressed, SHA3-256 hash-chained receipts of each normalization call.

What it is

szl-governed-norm is a Kernel Hub kernel built for two things people actually need from a normalization layer:

A correctness reference you can trust. RMSNorm and LayerNorm are implemented in pure PyTorch, computed in float32 for numerical stability and cast back to the input dtype (the standard Llama-style convention). They are verified against PyTorch's own references in the test suite.
Provenance you can verify. Run any call with governed=True and the kernel records a small, deterministic receipt — input shape/dtype, eps, and a SHA3-256 digest of the (rounded) output — hash-chained to the previous receipt. The result is an independently re-walkable audit trail for a sequence of kernel calls.

This is a universal kernel: it ships no hand-tuned CUDA/Triton binary. Its differentiator is verifiable governance, not raw FLOPs.
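To make the "digest of the (rounded) output" idea concrete, here is a minimal standard-library sketch. The function name `rounded_digest`, the choice of 4 decimals, and the little-endian float64 byte layout are all assumptions for illustration; the kernel's actual rounding and serialization scheme is not specified here.

```python
import hashlib
import struct

def rounded_digest(values, decimals=4):
    """SHA3-256 hex digest of `values` after rounding (hypothetical scheme)."""
    # Adding 0.0 turns -0.0 into 0.0, so the sign of zero cannot change the digest.
    rounded = [round(float(v), decimals) + 0.0 for v in values]
    # Pack as little-endian float64 so the byte layout is platform-independent.
    payload = struct.pack(f"<{len(rounded)}d", *rounded)
    return hashlib.sha3_256(payload).hexdigest()

a = rounded_digest([0.12345678, -1.0, 2.5])
b = rounded_digest([0.12345999, -1.0, 2.5])  # differs only past the 4th decimal
c = rounded_digest([0.5, -1.0, 2.5])         # differs in the first decimal
print(a == b, a == c)
```

Rounding before hashing trades sensitivity for stability: differences below the rounding step usually leave the fingerprint unchanged, although values that sit right at a rounding boundary can still land on different sides of it.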

Quickstart

import torch
from kernels import get_kernel

# Current `kernels` (>=0.15) requires an explicit revision/version + trust flag for org kernels:
gn = get_kernel("SZLHOLDINGS/szl-governed-norm", revision="main", trust_remote_code=True)
# (once a tag is published you can pin it, e.g. revision="v0.2.0")

print(gn.__version__)        # "0.2.0"
print(gn.selfcheck())        # one-shot correctness + receipt verification

x = torch.randn(4, 1024, dtype=torch.float16, device="cuda")
w = torch.ones(1024, dtype=torch.float16, device="cuda")

# Plain path — drop-in normalization.
y = gn.rms_norm(x, weight=w, eps=1e-6)
z = gn.layer_norm(x, weight=w, eps=1e-5)

Governed mode + receipts

# Same math, plus an audit receipt.
y = gn.rms_norm(x, weight=w, eps=1e-6, governed=True)

print(gn.receipt_head())     # SHA3-256 head over all governed calls
print(gn.receipt_verify())   # {'ok': True, 'depth': 1, 'first_break_seq': -1, 'head': '...'}

# Per-call chain (no global state — ideal for concurrent threads/requests):
chain = gn.ReceiptChain()
y = gn.rms_norm(x, weight=w, eps=1e-6, chain=chain)
print(chain.verify())        # (ok, depth, first_break_seq)

Governance is strictly opt-in: with governed=False (the default) and no chain passed, nothing is recorded, and the kernel never writes to disk or the network.

API reference

Functional API

Function	Signature	Notes
`rms_norm`	`rms_norm(x, weight=None, eps=1e-6, governed=False, chain=None)`	RMSNorm over the last dim. Emits a receipt when `governed=True` or a `chain` is passed.
`layer_norm`	`layer_norm(x, weight=None, bias=None, eps=1e-5, governed=False, chain=None)`	LayerNorm over the last dim.
`fused_add_rms_norm`	`fused_add_rms_norm(x, residual, weight=None, eps=1e-6, governed=False, chain=None)`	Residual-add + RMSNorm (pre-norm transformer block). Returns `(y, new_residual)`.
`selfcheck`	`selfcheck()`	One-shot correctness + governance check; returns a JSON-able dict, never raises.

All functions compute in float32 and cast back to the input dtype. rms_norm matches a Llama-style RMSNorm reference; layer_norm matches torch.nn.functional.layer_norm for the last-dim case (verified in tests/, 165 passing).
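The float32-then-cast-back convention can be sketched in a few lines of plain PyTorch. The `*_ref` names below are hypothetical and this is not the kernel's source; in particular, applying the weight in float32 before the cast and defining `new_residual` as `x + residual` are assumptions.

```python
import torch
import torch.nn.functional as F

def rms_norm_ref(x, weight=None, eps=1e-6):
    """RMSNorm over the last dim: upcast to float32, normalize, cast back."""
    xf = x.to(torch.float32)
    y = xf * torch.rsqrt(xf.pow(2).mean(dim=-1, keepdim=True) + eps)
    if weight is not None:
        y = y * weight.to(torch.float32)
    return y.to(x.dtype)

def layer_norm_ref(x, weight=None, bias=None, eps=1e-5):
    """LayerNorm over the last dim, written out by hand in float32."""
    xf = x.to(torch.float32)
    mean = xf.mean(dim=-1, keepdim=True)
    var = xf.var(dim=-1, unbiased=False, keepdim=True)  # population variance
    y = (xf - mean) * torch.rsqrt(var + eps)
    if weight is not None:
        y = y * weight.to(torch.float32)
    if bias is not None:
        y = y + bias.to(torch.float32)
    return y.to(x.dtype)

def fused_add_rms_norm_ref(x, residual, weight=None, eps=1e-6):
    """Residual add followed by RMSNorm; returns (y, new_residual)."""
    new_residual = x + residual  # assumed definition of the updated residual
    return rms_norm_ref(new_residual, weight, eps), new_residual

torch.manual_seed(0)
x = torch.randn(4, 64, dtype=torch.float16)
w = torch.ones(64, dtype=torch.float16)

y = rms_norm_ref(x, w)                      # float16 in, float16 out
x32 = x.float()
# Largest deviation of the hand-written LayerNorm from PyTorch's own.
ln_gap = (layer_norm_ref(x32) - F.layer_norm(x32, (64,))).abs().max().item()
y2, new_res = fused_add_rms_norm_ref(x, x, w)
print(y.dtype, ln_gap)
```

Comparing against `torch.nn.functional.layer_norm` in this way is the same kind of check the test suite is described as performing, just at toy scale.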

Governance receipt API

Function	Returns	Description
`receipt_head()`	`str`	SHA3-256 head of the default receipt chain (`"0"*64` if empty).
`receipt_count()`	`int`	Number of governed calls recorded on the default chain.
`receipt_tail(n=10)`	`list[dict]`	The last `n` receipts.
`receipt_verify()`	`dict`	Re-walks the chain; returns `{ok, depth, first_break_seq, head}`.
`ReceiptChain`	class	Construct your own isolated chain (`emit`, `head`, `count`, `tail`, `verify`).

`nn.Module` layers (for the `kernels` layer-mapping mechanism)

Pure torch.nn.Module subclasses (only forward, no custom __init__, no class variables) so they drop in over an existing module:

Layer	Reads from host module
`RMSNorm`	`self.weight` (optional), `self.variance_epsilon` or `self.eps`
`LayerNorm`	`self.weight`/`self.bias` (optional), `self.eps`
`FusedAddRMSNorm`	`self.weight` (optional), `self.variance_epsilon` or `self.eps`
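The "only forward, state read from the host module" pattern might look roughly like the sketch below. `ForwardOnlyRMSNorm` and `HostRMSNorm` are hypothetical names, the fallback eps of 1e-6 is an assumption, and the sketch calls `forward` directly with the host as `self` rather than going through the `kernels` mapping machinery.

```python
import torch
from torch import nn

class ForwardOnlyRMSNorm(nn.Module):
    # Hypothetical stand-in for a hub layer: only `forward`, no __init__,
    # no class variables. All state is looked up on whatever `self` is.
    def forward(self, x):
        eps = getattr(self, "variance_epsilon", None)
        if eps is None:
            eps = getattr(self, "eps", 1e-6)  # assumed fallback value
        weight = getattr(self, "weight", None)
        xf = x.to(torch.float32)
        y = xf * torch.rsqrt(xf.pow(2).mean(dim=-1, keepdim=True) + eps)
        if weight is not None:
            y = y * weight.to(torch.float32)
        return y.to(x.dtype)

class HostRMSNorm(nn.Module):
    """Illustrative host module that owns the parameters."""
    def __init__(self, dim, eps=1e-6):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(dim))
        self.variance_epsilon = eps

    def forward(self, x):
        raise NotImplementedError("meant to be replaced by the hub layer")

host = HostRMSNorm(64)
x = torch.randn(2, 64)
# Borrow the hub layer's forward, passing the host module as `self`.
y = ForwardOnlyRMSNorm.forward(host, x)
print(y.shape)
```

Because the layer carries no state of its own, it can be laid over any module that exposes the attributes listed in the table above without touching that module's parameters.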

Governed mode — provenance at the kernel layer

When a call runs in governed mode, the kernel builds a receipt body, takes a SHA3-256 digest over its canonical JSON, and links each receipt to the previous one via a prev field — a classic hash chain:

{
  "seq": 0, "op": "rms_norm", "in_shape": [4, 1024], "in_dtype": "float16",
  "eps": 1e-06, "out_digest": "<sha3-256 of the rounded output>", "prev": "<prev digest or 64 zeros>"
}

receipt_verify() re-walks the chain and reports the first break, so tampering with any receipt invalidates everything downstream. This is the same provenance doctrine SZL Holdings applies across its a11oy governed-AI platform — applied here at the lowest layer of the stack, the kernel itself.
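The chaining and re-walk logic can be illustrated with the standard library alone. `SketchChain` is a toy, not the kernel's `ReceiptChain`: the canonical-JSON settings (sorted keys, compact separators), the storage layout, and the placeholder `out_digest` strings are assumptions.

```python
import hashlib
import json

GENESIS = "0" * 64  # head of an empty chain

def canonical(body):
    # One possible canonical JSON form: sorted keys, no extra whitespace.
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")

def digest_of(body):
    return hashlib.sha3_256(canonical(body)).hexdigest()

class SketchChain:
    """Toy hash chain mirroring the receipt fields shown above."""
    def __init__(self):
        self.receipts = []

    def head(self):
        return self.receipts[-1]["digest"] if self.receipts else GENESIS

    def emit(self, op, in_shape, in_dtype, eps, out_digest):
        body = {"seq": len(self.receipts), "op": op, "in_shape": list(in_shape),
                "in_dtype": in_dtype, "eps": eps, "out_digest": out_digest,
                "prev": self.head()}
        self.receipts.append({"body": body, "digest": digest_of(body)})
        return self.head()

    def verify(self):
        prev, first_break = GENESIS, -1
        for seq, r in enumerate(self.receipts):
            body = r["body"]
            # A receipt is valid only if it links to its predecessor, sits at
            # the expected position, and still hashes to its stored digest.
            if body["prev"] != prev or body["seq"] != seq or digest_of(body) != r["digest"]:
                first_break = seq
                break
            prev = r["digest"]
        return {"ok": first_break == -1, "depth": len(self.receipts),
                "first_break_seq": first_break, "head": self.head()}

chain = SketchChain()
for i in range(3):
    # f"{i:064x}" is a placeholder standing in for a real output digest.
    chain.emit("rms_norm", (4, 1024), "float16", 1e-6, out_digest=f"{i:064x}")
before = chain.verify()

chain.receipts[1]["body"]["eps"] = 1e-5           # tamper with receipt 1
after_edit = chain.verify()                       # its stored digest no longer matches

chain.receipts[1]["digest"] = digest_of(chain.receipts[1]["body"])
after_rehash = chain.verify()                     # receipt 2's prev no longer matches
print(before["ok"], after_edit["first_break_seq"], after_rehash["first_break_seq"])
```

Re-hashing the edited receipt only moves the break one step downstream, which is why a consistent forgery would require rewriting every later receipt and the head.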

Correctness & honesty

Universal, pure-PyTorch kernel: a correctness reference, verified against PyTorch's own references (165 passing tests).
Runs on CPU and CUDA, torch.compile(fullgraph=True)-compatible. Under compile, governed numerics are unchanged but receipt emission (an eager byte-hashing side effect) is skipped — govern at the eager audit boundary.
No fabricated benchmarks. This is not a hand-tuned CUDA/Triton binary; we make no speedup claims.
The receipt digest is an integrity fingerprint, NOT a cryptographic signature. It proves a receipt sequence is internally consistent and untampered — not authorship. DSSE signing is a separate, out-of-band concern.
Governance is opt-in and side-effect-free by default.

Compatibility

Requirement	Version
Python	3.9+
PyTorch	`torch>=2.5`
Dependencies	Python standard library + `torch` only

See it live

This kernel has a 3D holographic showcase Space — the lattice is bound to the 165 passing tests, and the governance receipts are demonstrated interactively:

Showcase: governed-norm-holo
Drive it: receipt-chain-live — run governed ops, build a real SHA-256 receipt chain, tamper a receipt and watch verification fail honestly.
The whole substrate: szl-substrate

Λ governance is advisory (Conjecture 1, uniqueness OPEN) — never "proven trust." Honest BLOCKED beats fake green.

About SZL Holdings

SZL Holdings, founded by Stephen Lutar, builds governed-AI infrastructure — provenance, observability, and security tooling for AI systems. Its work includes the a11oy governed-AI platform and killinchu, 36 public repositories and a large public dataset corpus on the SZL Holdings Hugging Face org, and research published on Zenodo. This kernel applies that same governance doctrine at the level of a single PyTorch operation.

License

SZL Holdings · governed normalization · provenance at the kernel layer · a11oy.net · github.com/szl-holdings · huggingface.co/SZLHOLDINGS
