nr-network-known-class-detector
A binary attack-vs-benign detector for blockchain-node network/resource attacks, trained
entirely on faithful reproductions of publicly-disclosed attacks. Every attack class in the
corpus reproduces a specific public disclosure — a CVE, a GHSA, or a named third-party security
audit — and each carries a provenance.source_class recording how public its sourcing is. Part of
NullRabbit's work on autonomous defence for decentralised networks — watch the outside of the
perimeter.
STATUS: DIAGNOSTIC, not a deployment claim. Trained on synthetic localnet reproductions (lab fidelity), not real production traffic. See Evaluation and Limitations.
Model description
Given the network-layer signal of a short capture window against a blockchain node (packet-rate /
size statistics from the pcap, and amplification / request-response / timing statistics from the RPC
responses), the model emits a calibrated attack probability. It is a single multi-family model over the
network-v1 feature manifold, spanning 9 chains, all public-sourced (this published cut is
trained on public-cve-replication primitives only).
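As a rough illustration of the two signal groups described above, the sketch below derives a few window aggregates from raw packet and RPC observations. Only `pcap.packet_rate` and `resp.amp_ratio_max` appear in this card (in the usage example); the function name, the input tuple layouts, and the other feature names are assumptions made for illustration, not the network-v1 manifest itself.

```python
from statistics import mean, pstdev


def window_features(packets, exchanges):
    """Aggregate one capture window into a flat feature dict.

    packets:   list of (timestamp_seconds, size_bytes) tuples (assumed layout)
    exchanges: list of (request_bytes, response_bytes, latency_seconds) tuples (assumed layout)
    Features that cannot be computed are simply left out of the dict.
    """
    feats = {}

    if packets:
        times = [t for t, _ in packets]
        sizes = [s for _, s in packets]
        duration = max(times) - min(times)
        if duration > 0:
            feats["pcap.packet_rate"] = len(packets) / duration  # packets per second
        feats["pcap.size_mean"] = mean(sizes)    # hypothetical feature name
        feats["pcap.size_std"] = pstdev(sizes)   # hypothetical feature name

    # Amplification: response bytes returned per request byte sent.
    ratios = [resp / req for req, resp, _ in exchanges if req > 0]
    if ratios:
        feats["resp.amp_ratio_max"] = max(ratios)
    if exchanges:
        feats["resp.latency_mean"] = mean(lat for _, _, lat in exchanges)  # hypothetical

    return feats
```

A window with no packets and no RPC exchanges yields an empty dict, which corresponds to the "no network signal" case the scoreability gate is meant to catch.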
Architecture
- `HistGradientBoostingClassifier` + isotonic calibration (scikit-learn), NaN-native.
- pcap aggregates + RPC-response aggregates, degenerate columns dropped by a per-fit robust-column guard. No host-load features (the containerised lab node is root-owned, so host metrics are unreadable).
- Decision threshold 0.5 (calibrated). Inference is scoreability-gated: a record with no network signal (e.g. an economic/DeFi bundle) returns `scoreable=False` with no verdict.
Training data — 40 public-CVE attack primitives, 9 chains, 1262 bundles
This is the public-CVE cut (public-cve-replication only): 850 attack + 412 benign
bundles (pcap + responses + manifest), 46 chain×primitive instances. Benign traffic
exercises the same methods / wire messages the attacks abuse, at normal scale — so the model
separates attack-use from benign-use, not message type. Every attack reproduces an external public
disclosure with a provenance.public_source URL. (This cut contains 0 original primitives: NullRabbit's own
measurements of vendor-acknowledged RPC amplification, for which no CVE exists, are held in the full
corpus but excluded from this published cut, so the "trained entirely on faithful reproductions of
publicly-disclosed attacks" claim above is literal.)
| primitive | chain · layer | public source | source_class |
|---|---|---|---|
| `btc_addr_overflow_flood` | Bitcoin / Dogecoin / Litecoin · p2p | CVE-2024-52919 | public-cve-replication |
| `btc_alert_flood` | Bitcoin · p2p | CVE-2016-10724 | public-cve-replication |
| `btc_blocktxn_double_fillblock` | Bitcoin · p2p | CVE-2024-35202 | public-cve-replication |
| `btc_bloom_divzero` | Bitcoin · p2p | CVE-2013-5700 | public-cve-replication |
| `btc_cmpctblock_overflow` | Bitcoin · p2p | CVE-2025-46597 | public-cve-replication |
| `btc_cmpctblock_stall` | Bitcoin · p2p | CVE-2024-52922 | public-cve-replication |
| `btc_getdata_flood` | Bitcoin · p2p | CVE-2024-52920 | public-cve-replication |
| `btc_headers_genesis_spam` | Bitcoin · p2p | CVE-2024-52916 | public-cve-replication |
| `btc_headers_oom` | Bitcoin · p2p | CVE-2019-25220 | public-cve-replication |
| `btc_inv_buffer_blowup` | Bitcoin · p2p | CVE-2024-52915 | public-cve-replication |
| `btc_inv_eviction_jam` | Bitcoin · p2p | CVE-2024-52913 | public-cve-replication |
| `btc_invalid_block_logfill` | Bitcoin · p2p | CVE-2025-54605 | public-cve-replication |
| `btc_invdos_flood` | Bitcoin · p2p | CVE-2018-17145 | public-cve-replication |
| `btc_mutated_block` | Bitcoin · p2p | CVE-2024-52921 | public-cve-replication |
| `btc_orphan_cpu` | Bitcoin / Dogecoin / Litecoin · p2p | CVE-2024-52914 | public-cve-replication |
| `btc_oversized_recv_buffer` | Bitcoin · p2p | CVE-2015-3641 | public-cve-replication |
| `btc_tx_maprelay` | Bitcoin · p2p | CVE-2013-4627 | public-cve-replication |
| `btc_tx_quad_sighash` | Bitcoin · p2p | CVE-2025-46598 | public-cve-replication |
| `btc_version_selfnonce` | Bitcoin · p2p | CVE-2025-54604 | public-cve-replication |
| `btc_version_timestamp_overflow` | Bitcoin · p2p | CVE-2024-52912 | public-cve-replication |
| `p2p_getheaders_drain` | Bitcoin / Dogecoin / Litecoin · p2p | CVE-2023-33297 | public-cve-replication |
| `cosmos_p2p_conn_flood` | Cosmos · tcp-p2p-conn-flood | CVE-2020-5303 | public-cve-replication |
| `cosmos_protobuf_nest_bomb` | Cosmos · rpc-broadcast-tx | GHSA-8wcc-m6j2-qxvm | public-cve-replication |
| `geth_devp2p_ping_flood` | Ethereum · devp2p-rlpx | CVE-2023-40591 | public-cve-replication |
| `geth_eth_receipt_flood` | Ethereum · devp2p-rlpx | EL-2024-20 | public-cve-replication |
| `geth_rlpx_auth_flood` | Ethereum · devp2p-rlpx | EL-2026-06 | public-cve-replication |
| `geth_snap_trienode_dos` | Ethereum · devp2p-rlpx | CVE-2021-41173 | public-cve-replication |
| `geth_tcp_handshake_flood` | Ethereum · devp2p-rlpx | EL-2024-06 | public-cve-replication |
| `monero_levin_array_memcorrupt` | Monero · levin-p2p | CVE-2018-3972 | public-cve-replication |
| `monero_portable_storage_oom` | Monero · levin-p2p | PR#7190 | public-cve-replication |
| `sol_tpu_quic_handshake_flood` | Solana · tpu-quic | ND-FD04-LO-01 | public-cve-replication |
| `sol_tpu_quic_initial_cpu` | Solana · tpu-quic | ND-FD1-MD-02 | public-cve-replication |
| `sol_tpu_quic_slowloris` | Solana · tpu-quic | ND-FD04-IN-02 | public-cve-replication |
| `sui_disassemble_panic` | Sui · json-rpc | CertiK Skyfall | public-cve-replication |
| `sui_move_recursion` | Sui · json-rpc | CVE-2023-36184 | public-cve-replication |
| `sui_verifier_hamsterwheel` | Sui · json-rpc | CertiK Skyfall HamsterWheel | public-cve-replication |
| `gossipsub_prune_backoff_overflow` | libp2p · libp2p-gossipsub | CVE-2026-34219 | public-cve-replication |
| `gossipsub_subscribe_flood` | libp2p · libp2p-gossipsub | CVE-2026-46679 | public-cve-replication |
| `libp2p_signed_peer_record_flood` | libp2p · libp2p-gossipsub | CVE-2023-40583 | public-cve-replication |
| `libp2p_stream_exhaustion` | libp2p · libp2p-gossipsub | CVE-2022-23492 | public-cve-replication |
Distribution: 850 public-cve-replication attack bundles — 40 distinct primitives
across 9 chains (Bitcoin, Cosmos, Dogecoin, Ethereum, libp2p, Litecoin, Monero, Solana, Sui) — plus 412 benign. This published cut contains
no original bundles; the original RPC-measurement primitives live in the full internal corpus
and ship only if the operator explicitly opts in, always under their honest label.
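The cut rule described above (public-sourced attacks only, originals only on explicit opt-in) could be expressed as a manifest filter along these lines. The field names `provenance.source_class` and `provenance.public_source` come from this card; the dict layout, the `label` key, the `"original"` class value, and the function name are assumptions for illustration.

```python
PUBLIC = "public-cve-replication"
ORIGINAL = "original"  # assumed label value for NullRabbit's own measurements


def select_cut(manifests, include_original=False):
    """Return the manifests that belong in a published cut.

    Benign bundles always pass. Attack bundles pass only if their
    source_class is allowed, and public ones must carry a public_source URL.
    """
    allowed = {PUBLIC, ORIGINAL} if include_original else {PUBLIC}
    selected = []
    for manifest in manifests:
        if manifest.get("label") == "benign":  # hypothetical label key
            selected.append(manifest)
            continue
        provenance = manifest.get("provenance", {})
        source_class = provenance.get("source_class")
        if source_class not in allowed:
            continue
        if source_class == PUBLIC and not provenance.get("public_source"):
            continue  # a public claim without a URL is not admitted
        selected.append(manifest)
    return selected
```

Because the filter never rewrites `source_class`, an opted-in original bundle still carries its own label in the resulting cut.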
Training procedure (methodology is the contribution)
Per NullRabbit's pre-registration discipline: the corpus is built attack-by-attack from a public
disclosure with provenance.public_source; a Cleanlab data-quality scan gates label-issues and
duplicates before training; a methodology auditor reviews each gate event with sanity floors and
falsification holdouts; honest limitations are stated; cycles — not the final number — are the
contribution. This card + model are regenerated automatically from the cut on each training run, so
the numbers below always match the shipped model.
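To make the idea of a pre-training gate concrete, here is a deliberately simple stand-in that checks only for exact duplicate feature vectors. It is not Cleanlab and does not reproduce the label-issue scan or the auditor; the function names, the rounding precision, and the 1% threshold are assumptions.

```python
import hashlib
import json


def bundle_fingerprint(features, ndigits=6):
    """Hash a feature dict after rounding, independent of key order."""
    canonical = {key: round(float(value), ndigits) for key, value in features.items()}
    payload = json.dumps(canonical, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def duplicate_gate(bundles, max_dup_fraction=0.01):
    """Fail the gate when too many bundles repeat an earlier feature vector."""
    first_seen = {}
    duplicate_pairs = []
    for index, features in enumerate(bundles):
        fingerprint = bundle_fingerprint(features)
        if fingerprint in first_seen:
            duplicate_pairs.append((first_seen[fingerprint], index))
        else:
            first_seen[fingerprint] = index
    fraction = len(duplicate_pairs) / len(bundles) if bundles else 0.0
    return {
        "passed": fraction <= max_dup_fraction,
        "duplicate_fraction": fraction,
        "duplicate_pairs": duplicate_pairs,
    }
```

Returning the offending pairs rather than just a boolean gives a reviewer something specific to inspect when the gate blocks a training run.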
Evaluation
Diagnostic ML checks (the corpus of faithfully-modelled public attacks is the deliverable; these are
secondary). Reproduced by scripts/known_class_loco_eval.py + scripts/corpus_quality.py.
- Within-corpus held-out (binary attack-vs-benign ROC-AUC, GroupKFold by primitive): 0.8985. corpus_sha256 `known-class-v10-publiccve`.
- Leave-one-chain-out, binary ROC-AUC (HARD zero-shot transfer, not a deployment metric): Cosmos 1.000 / Litecoin 1.000 / Dogecoin 0.997 / Ethereum 0.969 / Bitcoin 0.910 / Sui 0.779 / libp2p 0.762 / Solana 0.641 / Monero 0.631. Chains with few public-CVE primitives have the fewest cross-chain near-neighbours; the companion `nr-bundles-public` dataset card reports the stricter held-out-chain 7-class family macro-F1 (0.17 Sui / 0.35 Solana vs ~0.14 floor). Reported honestly, not a deployment claim.
- Leave-one-attack-primitive-out within Bitcoin (leak-clean disjoint-benign): all Bitcoin primitives ≥ 0.997. Detection is on traffic shape, not deep wire-semantics.
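For readers unfamiliar with the two ingredients of these checks, the sketch below shows a dependency-free ROC-AUC and a leave-one-group-out splitter, where the group would be the chain (or the primitive). It illustrates the evaluation protocol only; it is not the code in `scripts/known_class_loco_eval.py`, and the function names are hypothetical.

```python
from collections import defaultdict


def roc_auc(labels, scores):
    """ROC-AUC as P(random attack outscores random benign); ties count as 0.5."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("ROC-AUC needs at least one attack and one benign sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def leave_one_group_out(groups):
    """Yield (held_out_group, train_indices, test_indices) per distinct group."""
    by_group = defaultdict(list)
    for index, group in enumerate(groups):
        by_group[group].append(index)
    for group in sorted(by_group):
        test = by_group[group]
        held_out = set(test)
        train = [i for i in range(len(groups)) if i not in held_out]
        yield group, train, test
```

Holding out a whole chain means the model never sees that chain's traffic during fitting, which is why the per-chain numbers above are a harder test than the within-corpus figure.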
Intended uses
Research and benchmarking of network/resource-abuse detection on blockchain infrastructure; a worked, public-provenance reference corpus; downstream training. Not a turnkey production IDS.
Limitations
- Synthetic lab fidelity — generated localnet traffic, not a real-world deployment claim. A deployment claim needs a real-traffic validation gate (real mainnet RPC + real attack instances).
- Detection is on traffic shape (volume / rate / size / connection-churn), not deep wire semantics — adequate for these volumetric/crash DoS classes; it would not separate two attacks with identical traffic profiles.
- No host-load features (root-owned container).
- This is the public-CVE cut: every shipped attack class reproduces an external public disclosure. The `original` RPC-amplification measurements (vendor-acknowledged but not CVE-backed) are excluded from this model; they exist in the full internal corpus and ship only on explicit operator opt-in.
How to use
```python
from predict import load, predict

model = load("model.joblib")
out = predict(model, [{"pcap.packet_rate": 850.0, "resp.amp_ratio_max": 224.0}])
# -> [{"scoreable": True, "score": ..., "verdict": "attack"|"benign", "threshold": 0.5}]
```

Run `python inference_example.py` for a worked example on real captured vectors.
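Since unscoreable records come back with no verdict, downstream code should branch on `scoreable` before reading `verdict`. A minimal sketch of that handling, assuming the result dicts have the shape shown in the usage comment (the `triage` helper itself is hypothetical and not part of `predict.py`):

```python
def triage(results):
    """Count verdicts, keeping unscoreable records in their own bucket."""
    summary = {"attack": 0, "benign": 0, "unscoreable": 0}
    for result in results:
        if not result.get("scoreable", False):
            # No network signal: there is no score or verdict to read.
            summary["unscoreable"] += 1
        else:
            summary[result["verdict"]] += 1
    return summary
```

Counting unscoreable records separately keeps them from being silently treated as benign.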
Licensing
Apache-2.0 (see LICENSE). Attribution appreciated.
Citation
```bibtex
@software{nullrabbit_network_known_class_2026,
  author = {NullRabbit Labs},
  title = {nr-network-known-class-detector: a public-provenance blockchain network-attack detector},
  year = {2026},
  url = {https://huggingface.co/NullRabbit/nr-network-known-class-detector}
}
```
Related: the open bundle format (nr-bundle-spec), the family taxonomy (mechanism-defined),
the earned-autonomy framework (Zenodo 10.5281/zenodo.18406828),
the dataset NullRabbit/nr-bundles-public, and nullrabbit.ai.
Contact
NullRabbit Labs — huggingface.co/NullRabbit · nullrabbit.ai