ModelScan bypass: an oversized NumPy .npy header (scan-vs-load divergence) hides an object-array pickle, giving code execution on np.load while ModelScan reports 0 issues

Program: huntr, AI/ML Model File Formats (beta)
Target (dropdown): NumPy (.npy)
Affected tool: protectai/modelscan v0.8.8 (latest on PyPI), modelscan.scanners.NumpyUnsafeOpScan / modelscan/tools/picklescanner.py::scan_numpy
Affected NumPy range for the divergence: NumPy 1.24.3 to 2.0.x (the versions modelscan targets; modelscan==0.8.8 pins only numpy>=1.24.3). See "NumPy version note" for >=2.1.
Class: Scanner bypass via scan-vs-load parser divergence -> arbitrary code execution (and arbitrary file read) on model load.
Severity (honest): High. Maps to the NumPy .npy dropdown tier (up to $1,500). It is a true detection bypass of a CRITICAL-rated payload, but it is not a fully silent "green check"; see "Honest severity framing".
Status: CONFIRMED, re-verified locally on modelscan 0.8.8.
Date: 2026-06-15


Summary

ModelScan inspects a NumPy .npy file for malicious pickle operators only after it parses the .npy header and confirms the array dtype is an object dtype (dtype.hasobject). It parses that header with NumPy's default header-size cap of max_header_size = 10000 characters.

NumPy's real loader, np.load(..., allow_pickle=True), parses the same header with max_header_size = 2**64. NumPy deliberately removes the cap on the load path because "the file is by definition trusted when allow_pickle is passed".

This asymmetry is exploitable. If an attacker pads the .npy header past 10000 characters:

  • ModelScan's scan path raises ValueError("Header info length ... is large ...") before it ever checks the dtype, so it never disassembles the trailing pickle. ModelScan marks the file SKIPPED (SCAN_NOT_SUPPORTED), records a non-fatal scanner error, and reports total_issues = 0.
  • The loader path parses the oversized header without complaint, sees the object dtype, and runs pickle.load(fp) -> arbitrary code execution.

The identical payload in a normal-size-header .npy is correctly caught by ModelScan as a CRITICAL os.system operator. Padding the header turns that CRITICAL detection into a 0-issue result, while the payload still fires on load. That divergence between what ModelScan parses and what NumPy loads is the vulnerability.

Root cause (exact locations in ModelScan + NumPy)

ModelScan, in modelscan/tools/picklescanner.py, scan_numpy():

elif magic == np.lib.format.MAGIC_PREFIX:
    # .npy file
    version = np.lib.format.read_magic(stream)                      # line 230
    np.lib.format._check_version(version)                           # line 231
    _, _, dtype = np.lib.format._read_array_header(stream, version) # line 232  <-- DEFAULT cap = 10000
    if dtype.hasobject:                                             # line 234  <-- never reached for big header
        return scan_pickle_bytes(model, settings, scan_name, True, stream.tell())  # line 235 (the actual pickle scan)
    else:
        return ScanResults([], [], [])

_read_array_header(stream, version) is called without a max_header_size argument, so it uses NumPy's default of 10000. For a header longer than that, this line raises ValueError, so control never reaches the dtype.hasobject check on line 234 and the trailing pickle is never inspected.

NumPy: the load path uses an unbounded cap. np.load -> numpy.lib.format.read_array(fp, allow_pickle=True) calls the same header reader with max_header_size=2**64. From NumPy's own docs for read_array/load: "max_header_size ... is ignored when allow_pickle is passed, as the file is by definition trusted." Every other caller gets the default, max_header_size = 10000. So the loader parses what the scanner refuses to.

How the raised ValueError becomes total_issues = 0, in modelscan/modelscan.py::_scan_source (lines 173-219): the exception from scan_numpy is caught at line 175, appended to self._errors as a ModelScanScannerError, and the loop continues without marking the file scanned. Because no scanner scanned the file, lines 209-219 then add a SkipCategories.SCAN_NOT_SUPPORTED skip. Net summary: total_issues=0, total_scanned=0, total_skipped=1, errors=1.
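The accounting can be illustrated with a toy model. This is not ModelScan's code: scan_file, failing_scanner, working_scanner, and the returned dictionary are hypothetical stand-ins that only mimic the control flow described above (exception caught, error recorded, file never marked scanned, skip added).

```python
def failing_scanner(data: bytes) -> list:
    """Hypothetical scanner whose header parse rejects the file."""
    raise ValueError("Header info length (15057) is large")


def working_scanner(data: bytes) -> list:
    """Hypothetical scanner that parses the file and finds one issue."""
    return ["CRITICAL: os.system"]


def scan_file(data: bytes, scanners) -> dict:
    issues, errors, skipped = [], [], []
    scanned = False
    for scanner in scanners:
        try:
            found = scanner(data)
        except Exception as exc:
            # The error is recorded, but the loop simply moves on.
            errors.append(str(exc))
            continue
        issues.extend(found)
        scanned = True
    if not scanned:
        # No scanner completed, so the file is reported as skipped.
        skipped.append("SCAN_NOT_SUPPORTED")
    return {
        "total_issues": len(issues),
        "total_scanned": int(scanned),
        "total_skipped": len(skipped),
        "errors": len(errors),
    }


print(scan_file(b"", [working_scanner]))
print(scan_file(b"", [failing_scanner]))
```

In this model the failing scanner yields zero issues, zero scanned files, one skip, and one error, which is the same shape as the summary quoted above: the issue count alone cannot distinguish "nothing found" from "nothing inspected".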

So the security control's "did you find anything?" answer is 0, even though a CRITICAL operator is sitting in the file's pickle.

Proof of concept

The PoC builds two .npy files holding the same 1-element object array. The object's __reduce__ returns (os.system, ("<benign marker + /etc/passwd read>",)), so unpickling runs the command. NumPy stores an object array as pickle.dumps(ndarray) immediately after the header, so np.load(allow_pickle=True) unpickles it. The only difference between the two files is 15000 spaces of header padding.

  • baseline_small.npy: header 57 chars -> expected: ModelScan DETECTS.
  • attack_bigheader.npy: header 15057 chars -> expected: ModelScan MISSES (0 issues), np.load executes.

make_poc.py regenerates both files; reproduce.py runs the full two-part assertion below.

(a) ModelScan: baseline detected, attack missed (captured verbatim, modelscan 0.8.8)

[baseline] total_issues=1  total_scanned=1  total_skipped=0  errors=0
        ISSUE: severity=CRITICAL operator=posix.system        # 'nt.system' when scanned on Windows

[attack]   total_issues=0  total_scanned=0  total_skipped=1  errors=1
        ERROR: Header info length (15057) is large and may not be safe to load securely.
        SKIP : SCAN_NOT_SUPPORTED - Model Scan did not scan file

ModelScan flags the unpadded object array as a CRITICAL os.system; the byte-for-byte identical payload with a padded header produces 0 issues.

(b) The NumPy divergence, demonstrated directly (no ModelScan involved)

reproduce.py STEP 0 calls NumPy's own header reader on both code paths:

scan-path default max_header_size : 10000
  normal (57 chars)        scan-path=PARSED                          loader-path=PARSED
  oversized (15057 chars)  scan-path=RAISED ValueError: Header ...   loader-path=PARSED

The scan-time cap (10000) rejects the oversized header; the load-time cap (2**64) accepts it. This is the root divergence, verifiable without ModelScan at all.
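The effect of the cap can also be seen with no NumPy import at all. The sketch below is a stdlib-only approximation of a format version 1.0 header reader, not NumPy's implementation; build_npy, read_npy_header, and try_parse are hypothetical helper names, and the file holds a harmless float64 array with no pickle. The only thing that differs between the two calls is the cap.

```python
import ast
import io
import struct

MAGIC = b"\x93NUMPY"
SCAN_CAP = 10_000      # NumPy's default max_header_size (scan path)
LOAD_CAP = 2**64       # cap used when allow_pickle=True (load path)


def build_npy(padding: int) -> bytes:
    """Build a version 1.0 .npy holding three float64 zeros (no pickle)."""
    header = "{'descr': '<f8', 'fortran_order': False, 'shape': (3,), }"
    header += " " * padding + "\n"
    raw = header.encode("latin1")
    # Version 1.0 stores the header length as a little-endian uint16.
    return MAGIC + bytes([1, 0]) + struct.pack("<H", len(raw)) + raw + bytes(24)


def read_npy_header(stream, max_header_size: int) -> dict:
    """Parse a version 1.0 header, refusing headers longer than the cap."""
    if stream.read(6) != MAGIC:
        raise ValueError("not a .npy file")
    major, minor = stream.read(2)
    if (major, minor) != (1, 0):
        raise ValueError("this sketch only handles format version 1.0")
    (length,) = struct.unpack("<H", stream.read(2))
    header = stream.read(length).decode("latin1")
    if len(header) > max_header_size:
        raise ValueError(f"Header info length ({len(header)}) is large")
    return ast.literal_eval(header)


def try_parse(data: bytes, cap: int) -> str:
    try:
        read_npy_header(io.BytesIO(data), cap)
        return "PARSED"
    except ValueError:
        return "RAISED ValueError"


for name, pad in (("normal", 0), ("oversized", 15_000)):
    data = build_npy(pad)
    print(name, "scan-path:", try_parse(data, SCAN_CAP),
          "| loader-path:", try_parse(data, LOAD_CAP))
```

The normal header parses under both caps, while the padded one is rejected under the 10000 cap and accepted under 2**64, matching the table above.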

(c) The loader executes the payload

np.load(attack_bigheader.npy, allow_pickle=True) ...
np.load returned repr : array([0], dtype=object)
RCE marker written    : True  -> code-exec        # /tmp/modelscan_npy_pwned.txt
arbitrary file read   : True  (... bytes from /etc/passwd)   # /tmp/modelscan_npy_file_read.txt

(Confirmed on the test host with a portable payload: np.load ran os.system, wrote the code-exec marker file, and, on Linux, copied /etc/passwd to a proof file. The returned array element is the command's exit status, confirming os.system executed during unpickling.)

Two-part assertion satisfied: ModelScan reports total_issues = 0 for attack_bigheader.npy (does not report the CRITICAL operator it catches at normal size), while np.load(..., allow_pickle=True) of that same file executes attacker code and reads a host file.

Impact

Threat model is the standard, ModelScan-endorsed "scan untrusted models before you load them" workflow: a victim downloads a .npy (e.g. from the Hub or a shared artifact store), runs modelscan -p model.npy, sees 0 issues, and then loads it with np.load(path, allow_pickle=True) (or any library that does so internally). Result on the victim host:

  • Arbitrary code execution at load time (the os.system payload; replace with any command for a reverse shell, credential theft, persistence).
  • Arbitrary file read / exfiltration as a sub-case (the PoC reads /etc/passwd; the same primitive reads cloud creds, SSH keys, tokens).

This is exactly the impact class ModelScan exists to prevent, and it is the same CRITICAL it correctly flags for the un-padded file, achieved here past the scanner.

Honest severity framing (what "bypass" does and does not mean)

I want to be precise for the triager: ModelScan does not print a reassuring green "0 issues, file scanned safe". For the attack file it prints total_issues = 0, marks the file skipped (SCAN_NOT_SUPPORTED), and records a non-fatal scanner error; the CLI also exits non-zero (exit code 2/3, not 0). So the real-world severity depends on how the consuming pipeline interprets the result:

  • Pipelines that gate on total_issues == 0 (a very common integration: "block if any issues, else proceed") treat this as a PASS and go on to np.load -> full RCE. For these, this is a complete detection bypass of a CRITICAL payload.
  • Pipelines that fail closed on skips, errors, or a non-zero exit are not bypassed; for them the impact degrades to a denial-of-scan (the malicious file is never assessed). Still a security-relevant gap (a hostile file evades inspection), but not silent RCE.

I am claiming the former, with the explicit caveat above: this is not a silent green check. The fix (below) is the same regardless of which interpretation a given user has.
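The two pipeline behaviours can be sketched as two gate functions. The summary dictionaries are hypothetical: their keys mirror the counters quoted in this report, not ModelScan's actual JSON report schema, and permissive_gate / fail_closed_gate are illustrative names.

```python
def permissive_gate(summary: dict) -> bool:
    """Proceed to load whenever no issues were reported."""
    return summary["total_issues"] == 0


def fail_closed_gate(summary: dict) -> bool:
    """Proceed only if the file was actually scanned, cleanly, with no issues."""
    return (
        summary["total_issues"] == 0
        and summary["total_skipped"] == 0
        and summary["errors"] == 0
        and summary["total_scanned"] > 0
    )


# Counters as reported for the two PoC files in section (a).
baseline = {"total_issues": 1, "total_scanned": 1, "total_skipped": 0, "errors": 0}
attack = {"total_issues": 0, "total_scanned": 0, "total_skipped": 1, "errors": 1}

for name, summary in (("baseline", baseline), ("attack", attack)):
    print(name,
          "permissive:", permissive_gate(summary),
          "fail-closed:", fail_closed_gate(summary))
```

Both gates block the baseline file. Only the permissive gate lets the attack file through, which is the case this report claims.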

Honest dup note (nearest prior art and why this is distinct)

Nearest public prior art is CVE-2025-46417 (SecDim / Sorin Boia, 2025), "Bypassing AI Model Scanners and Exfiltrate Sensitive Data". That issue is in picklescan (fixed in picklescan 0.0.25), and its root cause is that the scanner did not follow the pickle embedded inside a NumPy object array at all, i.e. a "scanner ignores object-array pickles" coverage gap. The payload is a numpy object array whose pickle does an SSL/DNS exfil.

This finding is distinct on both tool and root cause:

  • Different tool: this is ModelScan, not picklescan. ModelScan does inspect the object-array pickle: scan_numpy explicitly checks dtype.hasobject and then calls scan_pickle_bytes (and it correctly catches the un-padded version as CRITICAL, as shown above). So the picklescan "doesn't look inside object arrays" gap does not apply.
  • Different root cause: the bypass here is not "the scanner ignores the pickle". It is a parser/format desync: ModelScan reads the .npy header with max_header_size=10000 while NumPy's loader reads it with 2**64. An oversized header makes the scanner's parse fail before the dtype check, so the (otherwise-working) object-array pickle scan never runs. The exploit primitive is the header field, not the pickle contents.

Generic NumPy/pickle deserialization-RCE (CVE-2019-6446, the allow_pickle lineage) is also not this: that is about whether pickles run, and is well known. This finding is specifically about ModelScan failing to see a payload it is otherwise designed to catch, because of the header-size cap mismatch. I did not find this .npy header-size scan-vs-load divergence in ModelScan documented publicly.

(There is additionally a separate, version-specific defect: on NumPy >=2.1 the private symbols ModelScan imports (_check_version, _read_array_header) were relocated to numpy.lib._format_impl, so ModelScan's .npy scanner raises AttributeError and cannot scan any .npy at all on a fresh pip install modelscan. That is a different bug; this report is about the header-size divergence on the NumPy versions ModelScan targets. See the note in reproduce.py and README.)

Remediation

Make ModelScan's scan path parse exactly what the loader will parse:

  1. Parse the header with the same (unbounded) cap NumPy's loader uses. In scan_numpy, call np.lib.format._read_array_header(stream, version, max_header_size=2**64) so an oversized header can never prevent the dtype check. The header is being read only to learn the dtype, not loaded as data, so a large header is not itself dangerous here; what matters is that it must not be allowed to skip the pickle scan.
  2. Fail closed on parse errors for known model formats. A .npy whose magic matched but whose header could not be parsed under the default cap should be treated as suspicious / unscannable-but-loadable, not as a benign skip that still yields total_issues = 0. At minimum, any object-dtype .npy that cannot be fully header-parsed should be reported as an issue (or the CLI/integration contract should make "skipped/errored" block by default), so a "scan clean -> load" pipeline cannot proceed.
  3. Defense in depth: if the header (under the safe cap) reports an object dtype, always scan the trailing pickle regardless of header-size anomalies; and document that total_issues == 0 is not a safe gate unless skipped == 0 and errors == 0.
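A minimal sketch of items 1 and 2 together, written against the on-disk format with the standard library only rather than against ModelScan or NumPy internals. classify_npy and descr_has_object are hypothetical names, the 'O' type-code check is only a rough approximation of dtype.hasobject, and the sketch applies no header cap at all, which a real scanner may prefer to replace with a generous finite bound.

```python
import ast
import io
import struct

MAGIC = b"\x93NUMPY"


def descr_has_object(descr) -> bool:
    """Rough stand-in for dtype.hasobject on a header 'descr' value."""
    if isinstance(descr, str):
        # Drop an optional byte-order prefix, then look for type code 'O'.
        return descr.lstrip("<>|=").startswith("O")
    if isinstance(descr, list):
        # Structured dtype: each field is (name, descr[, shape]).
        return any(descr_has_object(field[1]) for field in descr)
    return False


def classify_npy(data: bytes) -> str:
    """Return 'scan-pickle', 'no-pickle', 'not-npy', or 'block'."""
    stream = io.BytesIO(data)
    if stream.read(6) != MAGIC:
        return "not-npy"
    try:
        major, _minor = stream.read(2)
        # Versions 2.0 and 3.0 widen the length field to a uint32.
        fmt = "<H" if major == 1 else "<I"
        (length,) = struct.unpack(fmt, stream.read(struct.calcsize(fmt)))
        raw = stream.read(length)
        if len(raw) != length:
            raise ValueError("truncated header")
        header = ast.literal_eval(raw.decode("utf8" if major >= 3 else "latin1"))
        has_object = descr_has_object(header["descr"])
    except Exception:
        # Magic matched but the header is unreadable: never report "clean".
        return "block"
    return "scan-pickle" if has_object else "no-pickle"


# Header-only demo: an object-dtype header padded well past 10000 characters.
text = "{'descr': '|O', 'fortran_order': False, 'shape': (1,), }" + " " * 15000 + "\n"
raw = text.encode("latin1")
padded = MAGIC + bytes([1, 0]) + struct.pack("<H", len(raw)) + raw
print(classify_npy(padded))
```

The design choice is that a file whose magic matches can only end in one of three states: its pickle gets scanned, it provably has no pickle, or it is blocked. There is no path on which an unreadable header turns into a clean result.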

Environment / reproduction

  • protectai/modelscan 0.8.8 (latest), numpy 2.4.6 test host (see NumPy version note: the divergence is native on numpy 1.24.3 to 2.0.x; on >=2.1 reproduce.py re-binds the two private symbols ModelScan imports so the scanner runs as it does on its target NumPy. STEP 0 proves the divergence against NumPy directly, with no shim).
  • Files in this package: report.md, README.md, reproduce.py (one-command end-to-end), make_poc.py (regenerates artifacts), baseline_small.npy, attack_bigheader.npy, scan_report.json (ModelScan JSON report for the attack file).
  • Run: python reproduce.py -> prints the scanner numbers for both files, the NumPy divergence table, and the loader impact, then a VERDICT.
