nr-network-known-class-detector

A binary attack-vs-benign detector for blockchain-node network/resource attacks, trained entirely on faithful reproductions of publicly disclosed attacks. Every attack class in the corpus reproduces a specific public disclosure (a CVE, a GHSA, or a named third-party security audit), and each carries a provenance.source_class recording how public its sourcing is. Part of NullRabbit's work on autonomous defence for decentralised networks: watch the outside of the perimeter.

STATUS: DIAGNOSTIC, not a deployment claim. Trained on synthetic localnet reproductions (lab fidelity), not real production traffic. See Evaluation and Limitations.

Model description

Given the network-layer signal of a short capture window against a blockchain node (packet-rate / size statistics from the pcap, and amplification / request-response / timing statistics from the RPC responses), the model emits a calibrated attack probability. It is one multi-family model over the network-v1 feature manifold — it spans 9 chains, all public-sourced (this published cut is trained on public-cve-replication primitives only).
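As a rough illustration of the kind of window-level aggregates described above, the sketch below derives a packet rate, a mean packet size and a maximum amplification ratio from toy inputs. The helper name `window_features`, its input shapes and the `pcap.size_mean` key are assumptions made for illustration; only `pcap.packet_rate` and `resp.amp_ratio_max` appear in this card, and the actual network-v1 extractor may compute them differently.

```python
import math
from statistics import mean

def window_features(packet_sizes, window_seconds, exchanges):
    """Toy window aggregates (hypothetical helper, not the shipped extractor).

    packet_sizes   -- sizes in bytes of the packets seen in the window
    window_seconds -- length of the capture window
    exchanges      -- (request_bytes, response_bytes) pairs from RPC traffic
    """
    nan = math.nan  # missing signal stays NaN, matching a NaN-native model
    feats = {
        "pcap.packet_rate": len(packet_sizes) / window_seconds if packet_sizes else nan,
        "pcap.size_mean": mean(packet_sizes) if packet_sizes else nan,  # assumed name
    }
    # Amplification: how many response bytes one request byte buys.
    ratios = [resp / req for req, resp in exchanges if req > 0]
    feats["resp.amp_ratio_max"] = max(ratios) if ratios else nan
    return feats

# Made-up inputs: four packets in a 2-second window, two RPC exchanges.
feats = window_features([60, 60, 1500, 1500], 2.0, [(100, 22400), (100, 300)])
print(feats)
```

A window with no packets and no RPC exchanges yields NaN for every feature here, which is the case the scoreability gate in the Architecture section is meant to catch.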

Architecture

  • HistGradientBoostingClassifier + isotonic calibration (scikit-learn), NaN-native.
  • Features: pcap aggregates + RPC-response aggregates, with degenerate columns dropped by a per-fit robust-column guard. No host-load features (the containerised lab node is root-owned, so host metrics are unreadable).
  • Decision threshold 0.5 (calibrated). Inference is scoreability-gated: a record with no network signal (e.g. an economic/DeFi bundle) returns scoreable=False with no verdict.
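The scoreability gate and the 0.5 threshold can be sketched as below. This is a minimal stand-in under stated assumptions, not the shipped inference code: the function names, the `pcap.` / `resp.` prefix test, the `>=` comparison at the threshold, and the bare `{"scoreable": False}` return shape are all hypothetical, and `score_fn` stands in for the calibrated model.

```python
import math

THRESHOLD = 0.5  # decision threshold stated in this card

def is_missing(value):
    """Treat None and NaN as 'no signal'."""
    return value is None or (isinstance(value, float) and math.isnan(value))

def gate_and_decide(record, score_fn, network_prefixes=("pcap.", "resp.")):
    """Return a verdict only when the record carries some network signal.

    network_prefixes must be a tuple; the prefix test is an assumption about
    how network features are named.
    """
    has_signal = any(
        key.startswith(network_prefixes) and not is_missing(value)
        for key, value in record.items()
    )
    if not has_signal:
        return {"scoreable": False}  # no score and no verdict
    score = score_fn(record)
    verdict = "attack" if score >= THRESHOLD else "benign"  # '>=' is assumed
    return {"scoreable": True, "score": score, "verdict": verdict, "threshold": THRESHOLD}

# 'defi.swap_count' is a made-up non-network feature name.
print(gate_and_decide({"defi.swap_count": 3.0}, lambda r: 0.9))
print(gate_and_decide({"pcap.packet_rate": 850.0}, lambda r: 0.9))
```

Gating before scoring keeps a record with no network signal from being reported as benign simply because the model had nothing to look at.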

Training data — 40 public-CVE attack primitives, 9 chains, 1262 bundles

This is the public-CVE cut (public-cve-replication only): 850 attack + 412 benign bundles (pcap + responses + manifest), 46 chain×primitive instances. Benign traffic exercises the same methods and wire messages the attacks abuse, at normal scale, so the model separates attack-use from benign-use, not message type. Every attack reproduces an external public disclosure with a provenance.public_source URL. This cut contains 0 original primitives: NullRabbit's own measurements of vendor-acknowledged RPC amplification, for which no CVE exists, are held in the full corpus but excluded here, so the "trained entirely on public disclosures" claim above is literal.

| primitive | chain · layer | public source | source_class |
|---|---|---|---|
| btc_addr_overflow_flood | Bitcoin / Dogecoin / Litecoin · p2p | CVE-2024-52919 | public-cve-replication |
| btc_alert_flood | Bitcoin · p2p | CVE-2016-10724 | public-cve-replication |
| btc_blocktxn_double_fillblock | Bitcoin · p2p | CVE-2024-35202 | public-cve-replication |
| btc_bloom_divzero | Bitcoin · p2p | CVE-2013-5700 | public-cve-replication |
| btc_cmpctblock_overflow | Bitcoin · p2p | CVE-2025-46597 | public-cve-replication |
| btc_cmpctblock_stall | Bitcoin · p2p | CVE-2024-52922 | public-cve-replication |
| btc_getdata_flood | Bitcoin · p2p | CVE-2024-52920 | public-cve-replication |
| btc_headers_genesis_spam | Bitcoin · p2p | CVE-2024-52916 | public-cve-replication |
| btc_headers_oom | Bitcoin · p2p | CVE-2019-25220 | public-cve-replication |
| btc_inv_buffer_blowup | Bitcoin · p2p | CVE-2024-52915 | public-cve-replication |
| btc_inv_eviction_jam | Bitcoin · p2p | CVE-2024-52913 | public-cve-replication |
| btc_invalid_block_logfill | Bitcoin · p2p | CVE-2025-54605 | public-cve-replication |
| btc_invdos_flood | Bitcoin · p2p | CVE-2018-17145 | public-cve-replication |
| btc_mutated_block | Bitcoin · p2p | CVE-2024-52921 | public-cve-replication |
| btc_orphan_cpu | Bitcoin / Dogecoin / Litecoin · p2p | CVE-2024-52914 | public-cve-replication |
| btc_oversized_recv_buffer | Bitcoin · p2p | CVE-2015-3641 | public-cve-replication |
| btc_tx_maprelay | Bitcoin · p2p | CVE-2013-4627 | public-cve-replication |
| btc_tx_quad_sighash | Bitcoin · p2p | CVE-2025-46598 | public-cve-replication |
| btc_version_selfnonce | Bitcoin · p2p | CVE-2025-54604 | public-cve-replication |
| btc_version_timestamp_overflow | Bitcoin · p2p | CVE-2024-52912 | public-cve-replication |
| p2p_getheaders_drain | Bitcoin / Dogecoin / Litecoin · p2p | CVE-2023-33297 | public-cve-replication |
| cosmos_p2p_conn_flood | Cosmos · tcp-p2p-conn-flood | CVE-2020-5303 | public-cve-replication |
| cosmos_protobuf_nest_bomb | Cosmos · rpc-broadcast-tx | GHSA-8wcc-m6j2-qxvm | public-cve-replication |
| geth_devp2p_ping_flood | Ethereum · devp2p-rlpx | CVE-2023-40591 | public-cve-replication |
| geth_eth_receipt_flood | Ethereum · devp2p-rlpx | EL-2024-20 | public-cve-replication |
| geth_rlpx_auth_flood | Ethereum · devp2p-rlpx | EL-2026-06 | public-cve-replication |
| geth_snap_trienode_dos | Ethereum · devp2p-rlpx | CVE-2021-41173 | public-cve-replication |
| geth_tcp_handshake_flood | Ethereum · devp2p-rlpx | EL-2024-06 | public-cve-replication |
| monero_levin_array_memcorrupt | Monero · levin-p2p | CVE-2018-3972 | public-cve-replication |
| monero_portable_storage_oom | Monero · levin-p2p | PR#7190 | public-cve-replication |
| sol_tpu_quic_handshake_flood | Solana · tpu-quic | ND-FD04-LO-01 | public-cve-replication |
| sol_tpu_quic_initial_cpu | Solana · tpu-quic | ND-FD1-MD-02 | public-cve-replication |
| sol_tpu_quic_slowloris | Solana · tpu-quic | ND-FD04-IN-02 | public-cve-replication |
| sui_disassemble_panic | Sui · json-rpc | CertiK Skyfall | public-cve-replication |
| sui_move_recursion | Sui · json-rpc | CVE-2023-36184 | public-cve-replication |
| sui_verifier_hamsterwheel | Sui · json-rpc | CertiK Skyfall HamsterWheel | public-cve-replication |
| gossipsub_prune_backoff_overflow | libp2p · libp2p-gossipsub | CVE-2026-34219 | public-cve-replication |
| gossipsub_subscribe_flood | libp2p · libp2p-gossipsub | CVE-2026-46679 | public-cve-replication |
| libp2p_signed_peer_record_flood | libp2p · libp2p-gossipsub | CVE-2023-40583 | public-cve-replication |
| libp2p_stream_exhaustion | libp2p · libp2p-gossipsub | CVE-2022-23492 | public-cve-replication |

Distribution: 850 public-cve-replication attack bundles — 40 distinct primitives across 9 chains (Bitcoin, Cosmos, Dogecoin, Ethereum, libp2p, Litecoin, Monero, Solana, Sui) — plus 412 benign. This published cut contains no original bundles; the original RPC-measurement primitives live in the full internal corpus and ship only if the operator explicitly opts in, always under their honest label.
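The headline counts can be cross-checked with a few lines of arithmetic. The sketch below assumes, as the table indicates, that three primitives are reproduced on Bitcoin, Dogecoin and Litecoin and that every other primitive appears on a single chain; the variable names are illustrative only.

```python
# Primitives the table lists on more than one chain.
multi_chain = {
    "btc_addr_overflow_flood": ["Bitcoin", "Dogecoin", "Litecoin"],
    "btc_orphan_cpu": ["Bitcoin", "Dogecoin", "Litecoin"],
    "p2p_getheaders_drain": ["Bitcoin", "Dogecoin", "Litecoin"],
}
distinct_primitives = 40

# Single-chain primitives count once; multi-chain ones count once per chain.
single_chain = distinct_primitives - len(multi_chain)
instances = single_chain + sum(len(chains) for chains in multi_chain.values())

attack, benign = 850, 412
total = attack + benign
benign_share = round(benign / total, 3)

print(instances, total, benign_share)  # 46 1262 0.326
```

So roughly a third of the bundles are benign, which is worth keeping in mind when reading the ROC-AUC figures in the Evaluation section.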

Training procedure (methodology is the contribution)

Per NullRabbit's pre-registration discipline: the corpus is built attack-by-attack from a public disclosure with provenance.public_source; a Cleanlab data-quality scan gates label-issues and duplicates before training; a methodology auditor reviews each gate event with sanity floors and falsification holdouts; honest limitations are stated; cycles — not the final number — are the contribution. This card + model are regenerated automatically from the cut on each training run, so the numbers below always match the shipped model.

Evaluation

Diagnostic ML checks (the corpus of faithfully-modelled public attacks is the deliverable; these are secondary). Reproduced by scripts/known_class_loco_eval.py + scripts/corpus_quality.py.

  • Within-corpus held-out — binary attack-vs-benign ROC-AUC, GroupKFold by primitive: 0.8985. corpus_sha256 known-class-v10-publiccve.
  • Leave-one-chain-out — binary ROC-AUC (HARD zero-shot transfer, not a deployment metric): Cosmos 1.000 / Litecoin 1.000 / Dogecoin 0.997 / Ethereum 0.969 / Bitcoin 0.910 / Sui 0.779 / libp2p 0.762 / Solana 0.641 / Monero 0.631. Chains with few public-CVE primitives have the fewest cross-chain near-neighbours; the companion nr-bundles-public dataset card reports the stricter held-out-chain 7-class family macro-F1 (0.17 Sui / 0.35 Solana vs ~0.14 floor). Reported honestly, not a deployment claim.
  • Leave-one-attack-primitive-out within Bitcoin (leak-clean disjoint-benign): all Bitcoin primitives ≥ 0.997. Detection is on traffic shape, not deep wire-semantics.
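The grouped hold-out protocols above share one idea: every bundle from the held-out group (a chain or a primitive) is kept out of training. The sketch below is a dependency-free stand-in for that idea, not the code in scripts/known_class_loco_eval.py; the toy rows, the one-feature midpoint "model" and the helper names are all made up for illustration.

```python
def roc_auc(labels, scores):
    """Rank-based ROC-AUC: chance a random attack outscores a random benign (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs both classes in the test fold")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def leave_one_group_out(groups):
    """Yield (held_out, train_idx, test_idx), holding each group out once."""
    for held_out in sorted(set(groups)):
        train = [i for i, g in enumerate(groups) if g != held_out]
        test = [i for i, g in enumerate(groups) if g == held_out]
        yield held_out, train, test

rows = [  # (chain, packet_rate, label); made-up toy values
    ("Bitcoin", 900.0, 1), ("Bitcoin", 40.0, 0),
    ("Sui", 700.0, 1), ("Sui", 60.0, 0),
    ("Solana", 300.0, 1), ("Solana", 350.0, 0),
]
chains = [r[0] for r in rows]
results = {}
for chain, train, test in leave_one_group_out(chains):
    # Stand-in "model": distance above the midpoint of the train class means.
    atk = [rows[i][1] for i in train if rows[i][2] == 1]
    ben = [rows[i][1] for i in train if rows[i][2] == 0]
    midpoint = (sum(atk) / len(atk) + sum(ben) / len(ben)) / 2
    scores = [rows[i][1] - midpoint for i in test]
    results[chain] = roc_auc([rows[i][2] for i in test], scores)
print(results)  # {'Bitcoin': 1.0, 'Solana': 0.0, 'Sui': 1.0}
```

Passing primitive names instead of chain names as `groups` gives the leave-one-primitive-out variant. In the toy data the Solana attack has a lower packet rate than Solana benign traffic, so a model that learned "high rate means attack" on the other chains ranks it the wrong way round; this is the kind of shape mismatch that can pull a held-out-chain AUC down.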

Intended uses

Research and benchmarking of network/resource-abuse detection on blockchain infrastructure; a worked, public-provenance reference corpus; downstream training. Not a turnkey production IDS.

Limitations

  • Synthetic lab fidelity — generated localnet traffic, not a real-world deployment claim. A deployment claim needs a real-traffic validation gate (real mainnet RPC + real attack instances).
  • Detection is on traffic shape (volume / rate / size / connection-churn), not deep wire semantics — adequate for these volumetric/crash DoS classes; it would not separate two attacks with identical traffic profiles.
  • No host-load features (root-owned container).
  • This is the public-CVE cut — every shipped attack class reproduces an external public disclosure. The original RPC-amplification measurements (vendor-acknowledged but not CVE-backed) are excluded from this model; they exist in the full internal corpus and ship only on explicit operator opt-in.

How to use

from predict import load, predict
model = load("model.joblib")
out = predict(model, [{"pcap.packet_rate": 850.0, "resp.amp_ratio_max": 224.0}])
# -> [{"scoreable": True, "score": ..., "verdict": "attack"|"benign", "threshold": 0.5}]

Run python inference_example.py for a worked example on real captured vectors.
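When consuming the output, it may help to keep unscoreable records separate from benign ones. The sketch below works on dictionaries in the shape shown above; `summarise` is a hypothetical helper, the scores are made up, and the assumption that an unscoreable record carries only `scoreable: False` comes from the Architecture notes rather than from the `predict` source.

```python
def summarise(outputs):
    """Count verdicts, keeping unscoreable records apart from benign ones."""
    counts = {"attack": 0, "benign": 0, "unscoreable": 0}
    for out in outputs:
        if not out.get("scoreable", False):
            counts["unscoreable"] += 1  # no verdict was issued for this record
        else:
            counts[out["verdict"]] += 1
    return counts

# Hand-written example outputs (not produced by the model).
example = [
    {"scoreable": True, "score": 0.93, "verdict": "attack", "threshold": 0.5},
    {"scoreable": True, "score": 0.08, "verdict": "benign", "threshold": 0.5},
    {"scoreable": False},
]
print(summarise(example))  # {'attack': 1, 'benign': 1, 'unscoreable': 1}
```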

Licensing

Apache-2.0 (see LICENSE). Attribution appreciated.

Citation

@software{nullrabbit_network_known_class_2026,
  author = {NullRabbit Labs},
  title  = {nr-network-known-class-detector: a public-provenance blockchain network-attack detector},
  year   = {2026},
  url    = {https://huggingface.co/NullRabbit/nr-network-known-class-detector}
}

Related: the open bundle format (nr-bundle-spec), the family taxonomy (mechanism-defined), the earned-autonomy framework (Zenodo 10.5281/zenodo.18406828), the dataset NullRabbit/nr-bundles-public, and nullrabbit.ai.

Contact

NullRabbit Labs — huggingface.co/NullRabbit · nullrabbit.ai
