modelscan .pt zip-parser differential (Python zipfile vs torch miniz): silent pickle-RCE false negative

Severity: Critical (modelscan certifies the file CLEAN with no error, while torch.load executes attacker code)

Affected tool: modelscan 0.8.8 (and current main); zip walk in modelscan.py _iterate_models / get_zipfile.

Victim loader: torch.load(..., weights_only=False) via PyTorchStreamReader/miniz (torch 2.12.0).

Category: ModelScan scanner-bypass on PyTorch .pt (in-scope).

Summary

A new-style torch.save() .pt is a ZIP. modelscan walks its members with Python's stdlib zipfile; torch.load reads them with C++ miniz (PyTorchStreamReader). The two readers can be made to resolve the same logical member name (data.pkl) to different physical bytes. A crafted checkpoint makes modelscan's zipfile read a benign data.pkl (→ 0 issues, no error, file reported scanned) while miniz reads a malicious data.pkl whose __reduce__ runs os.system → silent RCE on torch.load(weights_only=False). Neither parser crashes: this is a silent success-on-different-bytes, not an exception-skip.

Two independent crafting techniques produce the divergence (both verified end-to-end; included as PoC variants):

Variant A: concatenated dual-archive (EOCD selection)

poc/variant-concatenated-eocd/: File = zipA || zipB, each an independently valid torch archive (A's data.pkl malicious, B's benign; byte lengths, offsets, and CRCs patched to stay valid). CPython zipfile locates the central directory from the last EOCD and self-corrects offsets (concat = EOCD_pos − cd_size − cd_offset) → reads zipB (benign). miniz reads member data at the literal header offset → resolves data.pkl into the leading archive → zipA (malicious). codeLocation: modelscan.py:98-112; CPython zipfile._RealGetContents (last-EOCD); caffe2::serialize::PyTorchStreamReader.
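The CPython half of this variant can be observed with the standard library alone. What follows is a minimal sketch, not the PoC: it concatenates two small stored archives carrying harmless placeholder payloads and checks which one zipfile resolves. The helper names make_zip and leading_bytes are hypothetical, and the miniz side is not exercised here.

```python
import io
import struct
import zipfile

EOCD_SIG = b"PK\x05\x06"  # end-of-central-directory signature


def make_zip(payload: bytes) -> bytes:
    """Build a one-member stored archive (hypothetical helper)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
        zf.writestr("archive/data.pkl", payload)
    return buf.getvalue()


def leading_bytes(blob: bytes) -> int:
    """Return EOCD_pos - cd_size - cd_offset for the last EOCD record.

    This is the correction term described above: the number of bytes that
    precede the archive which the last EOCD describes.
    """
    eocd_pos = blob.rfind(EOCD_SIG)
    if eocd_pos < 0:
        raise ValueError("no EOCD record found")
    # Fields: signature, disk, cd disk, entries on disk, total entries,
    # cd size, cd offset, comment length.
    fields = struct.unpack_from("<4s4H2LH", blob, eocd_pos)
    cd_size, cd_offset = fields[5], fields[6]
    return eocd_pos - cd_size - cd_offset


zip_a = make_zip(b"payload-from-A")  # placeholder for the leading archive
zip_b = make_zip(b"payload-from-B")  # placeholder for the trailing archive
combined = zip_a + zip_b

with zipfile.ZipFile(io.BytesIO(combined)) as zf:
    seen_by_zipfile = zf.read("archive/data.pkl")

print(seen_by_zipfile)
print(leading_bytes(combined), len(zip_a))
```

A nonzero value from leading_bytes means the file carries bytes in front of the archive that zipfile actually parses, which is the region where a reader using literal offsets could land on different data.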

Variant B: duplicate central-directory record + EOCD entry-count (miniz honors the count)

poc/variant-dup-centraldir/ — Central directory contains [data.pkl→EVIL], [version], [data.pkl→BENIGN] with the EOCD "total entries" field set to 2 while 3 records physically exist. miniz honors the count and reads only the first 2 records → its only data.pkl is EVIL. CPython zipfile keeps reading all 3 records (count is a hint, not a stop) and last-wins → BENIGN. codeLocation: modelscan/scanners/pickle/scan.py:27-28 (_is_zipfile → return None, so the container is walked by modelscan.py:94-112 stdlib zipfile); torch serialization.py:1949,2182 (get_record('data.pkl') via miniz); CPython zipfile/__init__.py:1482 (NameToInfo[x]=x unconditional last-wins).
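The zipfile behavior this variant relies on (all records read regardless of the declared count, last duplicate wins) can be sketched with the standard library. This is an illustration with harmless placeholder bytes rather than the PoC; build_archive, set_eocd_entry_count, and inspect_archive are hypothetical helper names, and nothing here exercises miniz.

```python
import io
import struct
import warnings
import zipfile
from collections import Counter

EOCD_SIG = b"PK\x05\x06"  # end-of-central-directory signature


def build_archive() -> bytes:
    """Write three records, two of which share the name archive/data.pkl."""
    buf = io.BytesIO()
    with warnings.catch_warnings():
        # zipfile emits a UserWarning when a name is written twice.
        warnings.simplefilter("ignore", UserWarning)
        with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
            zf.writestr("archive/data.pkl", b"first-record")
            zf.writestr("archive/version", b"3\n")
            zf.writestr("archive/data.pkl", b"last-record")
    return buf.getvalue()


def set_eocd_entry_count(blob: bytes, count: int) -> bytes:
    """Overwrite both EOCD entry-count fields (this disk, total)."""
    out = bytearray(blob)
    eocd_pos = out.rfind(EOCD_SIG)
    struct.pack_into("<HH", out, eocd_pos + 8, count, count)
    return bytes(out)


def inspect_archive(blob: bytes) -> dict:
    """Compare the declared entry count with what zipfile actually parses."""
    eocd_pos = blob.rfind(EOCD_SIG)
    declared = struct.unpack_from("<H", blob, eocd_pos + 10)[0]
    with zipfile.ZipFile(io.BytesIO(blob)) as zf:
        names = [info.filename for info in zf.infolist()]
        resolved = zf.read("archive/data.pkl")
    duplicates = sorted(n for n, c in Counter(names).items() if c > 1)
    return {
        "declared": declared,
        "parsed": len(names),
        "duplicates": duplicates,
        "resolved": resolved,
    }


report = inspect_archive(set_eocd_entry_count(build_archive(), 2))
print(report)
```

A mismatch between declared and parsed, or a non-empty duplicates list, is the structural signature of this variant as seen from the zipfile side.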

Reproduce

  • Variant A: python poc/variant-concatenated-eocd/final_assert.py → zipfile.open('…/data.pkl') head = benign dict; PyTorchFileReader.get_record('data.pkl') head = GLOBAL nt system; modelscan on the polyglot = {CRITICAL:0,…}, errors:[], scanned …halfB\data.pkl; modelscan on halfA alone = CRITICAL:1 (control: payload IS detectable); torch.load(weights_only=False) → marker written (RCE).
  • Variant B: python poc/variant-dup-centraldir/poc_dualcd_count_desync.py → mechanism-proof matrix: [EVIL,version,BENIGN] count=2 → torch=EVIL / modelscan=0 issues; inverse count=2 flips both verdicts; honest count=3 → both BENIGN, proving that the EOCD entry-count is the exact lever. Control: same gadget in a normal .pt → modelscan 1 CRITICAL. torch.load(weights_only=False) → marker written (RCE). File passes torch's strict _is_zipfile.
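The "head" comparisons in the steps above amount to looking at which globals a pickle stream names without unpickling it. A minimal sketch of that inspection for the zipfile side follows, using only the standard library and a benign pickle; globals_in_pickle and member_globals are hypothetical helper names, and this is not how modelscan itself is implemented.

```python
import collections
import io
import pickle
import pickletools
import zipfile


def globals_in_pickle(stream: bytes) -> list[str]:
    """List the 'module name' pairs named by GLOBAL opcodes.

    The stream is only disassembled, never unpickled. Protocol 4's
    STACK_GLOBAL opcode is deliberately not handled in this sketch.
    """
    return [
        arg
        for opcode, arg, _pos in pickletools.genops(stream)
        if opcode.name == "GLOBAL"
    ]


def member_globals(archive: bytes, member: str) -> list[str]:
    """Disassemble the bytes that Python's zipfile resolves for a member."""
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        return globals_in_pickle(zf.read(member))


# Benign stand-in for a checkpoint's data.pkl (protocol 2, no name remapping).
benign = pickle.dumps(collections.OrderedDict(), protocol=2, fix_imports=False)

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("archive/data.pkl", benign)

print(member_globals(buf.getvalue(), "archive/data.pkl"))
```

Because the helper goes through zipfile, it reports only what the scanner-side reader sees; the point of the finding is that the loader-side reader may resolve different bytes for the same name.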

Both ran clean in mfvenv (modelscan 0.8.8, torch 2.12.0+cpu, Python 3.12.10).

Impact

Complete silent bypass of modelscan's core pickle-RCE detection for PyTorch checkpoints, the exact threat modelscan exists to detect and one it does flag when the same payload is scanned standalone. modelscan is the de-facto pre-deployment/HF/CI gate; a model it certifies total_issues=0, errors=[] runs arbitrary os.system the moment a victim runs the standard torch.load(weights_only=False). Supply-chain weaponizable on any registry/CI clearing .pt via modelscan. Honest precondition: requires weights_only=False (PyTorch ≥2.6 defaults to True, an orthogonal mitigation), but that setting is pervasive in legacy/training-resume/from_pretrained call sites and is precisely modelscan's deployment scenario. No trust_remote_code is required.

Dup-check

Novel for modelscan. modelscan has zero published GitHub security advisories. All known zip-bypass CVEs target picklescan (a different codebase) and are crash-based single-archive manipulations: CVE-2025-1944 (local-vs-central name → BadZipFile), CVE-2025-1945 (flag-bit → error), CVE-2025-10156 (bad CRC → abort), GHSA-769v-p64c-89pr (alternate extension hidden pickle). This finding is the opposite of a crash: both parsers succeed, modelscan returns 0 issues with errors=[], silently scanning the wrong physical member. arXiv:2508.19774 "Art of Hide and Seek" Table II ZIP EOPs (double-PK0506, bad-centdir-count, etc.) are all exception-oriented (force the scanner to throw and skip), confirmed by direct fetch; none describe a silent zipfile-last-EOCD vs miniz divergence. ZIP-concatenation/duplicate-entry ambiguity is a known AV/installer evasion class (CPython #117779, uv wheel confusion) but has never been applied to the modelscan-vs-torch-miniz .pt scanner/loader pair. Distinct from our R2 FRAME desync (single pickle stream), R3 .npz inner-member rename (numpy, both use zipfile → no differential), and R5 legacy multi-pickle (non-zip).

Note: the zipfile-vs-miniz differential is specific to .pt/torch: .keras/.npz loaders also use Python zipfile, so no differential exists there. This is why it's filed as one .pt finding with two techniques, not replicated across formats.
