Load-time out-of-bounds read in PyTorch Mobile / Lite Interpreter .ptl via unchecked operator index in applyUpgrader()
Target (huntr dropdown): pytorch/pytorch
Affected component: PyTorch Lite Interpreter (mobile bytecode): torch._C._load_for_lite_interpreter / torch.jit.mobile._load_for_lite_interpreter
Affected/tested version: torch 2.12.0+cpu (git 7661cd9c6b841b62b7f411aa52ec51f05457263b); code path unchanged on current main.
File format: PyTorch Mobile .ptl (Lite Interpreter), a ZIP-of-pickles container; .pte-family target.
Vulnerability class: Out-of-bounds read (CWE-125), leading to denial of service; potential heap info-leak.
Severity
Rating: Medium. Load-time Denial of Service (memory-safety / OOB read).
Dollar tier (honest): .ptl is the PyTorch Mobile / Lite-Interpreter container, i.e. the .pte / mobile family, which puts it in the $1.5k tier, not a named $4k format (.joblib / .keras / .gguf / .safetensors / TF-SavedModel).
Reasoning (CVSS-style): Attack vector Network/file (an attacker-supplied .ptl); attack complexity Low; privileges None; user interaction Required (victim loads the model). The demonstrated impact is a reliable native crash (SIGSEGV / 0xC0000005 ACCESS_VIOLATION) at load time, before any inference, on any process that calls _load_for_lite_interpreter on an untrusted file. Because the wild index dereferences a c10::OperatorName and reads its .name std::string, an out-of-bounds heap read also occurs; under a heap layout where the bytes at op_names_ + X*sizeof(OperatorName) happen to be mapped, the adjacent heap could be surfaced as an "operator name" string (info-leak). This PoC demonstrates the crash deterministically and does not claim a working info-leak. I am not rating this as RCE: there is no control-flow hijack or write primitive here.
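Honest CVSS ~6.5 (AV:N/AC:L/PR:N/UI:R/S:U/C:N/I:N/A:H), availability-led. The vector uses C:N because the info-leak is not demonstrated; scoring it as C:L would raise the CVSS v3.1 base score to 7.1 (High), which would no longer match the Medium rating above.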
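The score can be cross-checked against the CVSS v3.1 base-score formula for Scope: Unchanged. The sketch below is a minimal calculator, not an official tool, and it only carries weights for the metric values that appear in this report. Under this formula, the vector with C:N evaluates to 6.5 and the same vector with C:L evaluates to 7.1.

```python
import math

# CVSS v3.1 metric weights, restricted to the values used in this report.
WEIGHTS = {
    "AV": {"N": 0.85},
    "AC": {"L": 0.77},
    "PR": {"N": 0.85},            # PR:N has the same weight in either scope
    "UI": {"N": 0.85, "R": 0.62},
    "CIA": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value):
    """Round up to one decimal place, following the CVSS v3.1 Roundup definition."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector):
    """Compute the base score for a Scope: Unchanged vector string."""
    m = dict(part.split(":") for part in vector.split("/"))
    if m["S"] != "U":
        raise ValueError("this sketch only handles S:U")
    c, i, a = (WEIGHTS["CIA"][m[k]] for k in ("C", "I", "A"))
    iss = 1 - (1 - c) * (1 - i) * (1 - a)
    impact = 6.42 * iss
    exploitability = (8.22 * WEIGHTS["AV"][m["AV"]] * WEIGHTS["AC"][m["AC"]]
                      * WEIGHTS["PR"][m["PR"]] * WEIGHTS["UI"][m["UI"]])
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score("AV:N/AC:L/PR:N/UI:R/S:U/C:N/I:N/A:H"))  # crash only
print(base_score("AV:N/AC:L/PR:N/UI:R/S:U/C:L/I:N/A:H"))  # crash plus limited leak
```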
Summary
applyUpgrader() in PyTorch's mobile bytecode deserializer indexes the per-method operator table code.op_names_[inst.X] for every OP instruction without bounds-checking inst.X. This pass runs at load time (inside _load_for_lite_interpreter, before forward()) whenever the model's file version record is < 0xA, which is true of essentially every real-world .ptl (normal version is 3). A crafted .ptl that declares a one-entry operators table but an OP instruction with a large index (e.g. 0x7F000000) makes the loader dereference memory many gigabytes past the table, causing an out-of-bounds read and a hard crash before any model logic runs. The malicious file carries no dangerous pickle opcodes, so picklescan / modelscan report it as clean.
Root cause
torch/csrc/jit/mobile/parse_bytecode.cpp, function applyUpgrader() (lines 69-114; the unchecked dereference is line 75):
void applyUpgrader(mobile::Function* function, uint64_t operator_version) {
  Code& code = function->get_code();
  auto& operator_version_map = getOperatorVersionMapForMobile();
  for (size_t i = 0; i < code.instructions_.size(); i++) {
    Instruction& inst = code.instructions_[i];
    if (inst.op == OpCode::OP) {
      std::string operator_name = code.op_names_[inst.X].name + // <-- line 75: NO bounds check on inst.X
          (code.op_names_[inst.X].overload_name.empty()
               ? ""
               : "." + code.op_names_[inst.X].overload_name);
      ...
inst.X is an attacker-controlled int taken verbatim from the bytecode pickle (parseInstructions() calls function->append_instruction(op_code, X, N) with X = ins_item[1].toInt(), parse_bytecode.cpp:156,161-163). code.op_names_ is populated from the model's operators table (parse_operators.cpp); its size is fully attacker-chosen and independent of inst.X. There is no validation that inst.X < op_names_.size() before the indexing on line 75.
Asymmetry that makes this a real, reachable bug: the runtime path is guarded but the load-time path is not. The interpreter's OP handler does bounds-check, in torch/csrc/jit/mobile/interpreter.cpp (lines 128-132):
case OP: {
  ...
  if (inst.X < 0 ||
      static_cast<size_t>(inst.X) >= code.op_names_.size() ||
      static_cast<size_t>(inst.X) >= code.operators_.size()) {
    TORCH_CHECK(false, "Can't load op with index: ", inst.X); // runtime is safe
  }
  ...
}
So an out-of-range OP index can never be dereferenced on the runtime path, because the guard rejects it first. applyUpgrader(), however, walks the same instructions at load time and dereferences op_names_[inst.X] with no such guard. The upgrader is invoked from BytecodeDeserializer::parseMethods() (mobile import.cpp) for every method when use_upgrader is true, where use_upgrader == (operator_version_ < caffe2::serialize::kProducedFileFormatVersion /* 0xA */). operator_version_ is the integer in the container's version record, which the producer sets to 3 for normal mobile models, so the vulnerable pass runs on essentially every .ptl during a normal _load_for_lite_interpreter.
Proof of concept
A self-contained reviewer script reproduce.py is attached. It builds a benign baseline .ptl (torch.jit.script of x + 1.0, saved via _save_for_lite_interpreter), then clones it and replaces only bytecode.pkl so that:
- the operators table has exactly one entry (op_names_.size() == 1), and
- the single OP instruction has X = 0x7F000000 (fits int32; X * sizeof(c10::OperatorName) lands many gigabytes away, in an unmapped page).
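To make the size of the overshoot concrete, the byte offset of the wild access can be worked out by hand. The sketch below assumes that c10::OperatorName consists of two std::string members (name and overload_name) and that std::string is 32 bytes, which is typical for 64-bit libstdc++ and MSVC builds but is not guaranteed; the element size is therefore a parameter.

```python
# Assumption: sizeof(std::string) == 32 on the target toolchain, and
# c10::OperatorName is two std::string members with no extra padding.
STRING_SIZE = 32
OPERATOR_NAME_SIZE = 2 * STRING_SIZE  # 64 bytes under the assumption above

INT32_MAX = 2**31 - 1


def oob_offset_bytes(index, element_size=OPERATOR_NAME_SIZE):
    """Byte distance from the start of op_names_ to element `index`."""
    return index * element_size


X = 0x7F000000
offset = oob_offset_bytes(X)

print("fits int32:", X <= INT32_MAX)
print("offset in bytes:", offset)
print("offset in GiB:", offset / 2**30)
```

With a one-entry table occupying 64 bytes, an access at that offset falls far outside any plausible heap allocation, which is consistent with the deterministic access violation reported here. A smaller element size (for example 24-byte strings) shrinks the offset but does not bring it anywhere near the table.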
It keeps the file version record at 3 so use_upgrader == true. It then (1) confirms the attack carrier bytecode.pkl contains no dangerous pickle opcodes and that modelscan's actual unsafe-global allowlist flags 0 globals across the whole .ptl; (2) loads each variant in a fresh child process (a load-time crash takes the whole interpreter down) calling only torch._C._load_for_lite_interpreter β never forward(); and (3) runs two controls that isolate the cause.
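The opcode check in step (1) can be approximated with the standard library alone. This is a minimal sketch, not the code in reproduce.py and not how picklescan or modelscan work internally: the set of "dangerous" opcodes is an illustrative assumption, and the sample payloads are hypothetical stand-ins rather than a real bytecode.pkl.

```python
import pickle
import pickletools

# Assumption: opcodes that import or construct arbitrary objects.
# Real scanners apply their own rules and allowlists.
DANGEROUS = {"GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ",
             "NEWOBJ", "NEWOBJ_EX"}


def dangerous_opcodes(data):
    """Return the sorted names of dangerous opcodes present in a pickle stream."""
    found = {op.name for op, _arg, _pos in pickletools.genops(data)}
    return sorted(found & DANGEROUS)


# Hypothetical instruction table made only of tuples, strings and ints.
# An oversized index is just an integer, so no opcode-based rule can see it.
plain = pickle.dumps((("OP", 0x7F000000, 0),), protocol=2)

# A pickle that references a callable, for contrast.
callable_ref = pickle.dumps(len, protocol=2)

print(dangerous_opcodes(plain))
print(dangerous_opcodes(callable_ref))
```

The point of the contrast is that an out-of-range operator index is a data-validity problem rather than a pickle-opcode problem, so a scanner that inspects opcodes and globals has nothing to flag.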
Build assertion (what the script checks)
[PASS] scanner-clean (bytecode.pkl no danger + modelscan 0 flags)
[PASS] benign .ptl loads clean
[PASS] malicious .ptl crashes loader natively
[PASS] control valid-index does NOT crash (version=3, OP index 0)
[PASS] control upgrader-off does NOT crash (version=10, OP index 0x7F000000)
PoC SUCCESS (scanner-clean + load-time native OOB crash): True
Captured output (torch 2.12.0+cpu, Windows; reproduce.py)
torch: 2.12.0+cpu | work dir: ...\ptl_oob_g7ywbo2q
breadcrumb marker: .../PTL_OOB_REACHED.txt
documented leak target (NOT read here): .../ADJACENT_SECRET.txt
[1] malicious vs benign .ptl differ ONLY in bytecode.pkl
SAME base/data.pkl
SAME base/code/__torch__.py
SAME base/code/__torch__.py.debug_pkl
SAME base/constants.pkl
DIFFERENT base/bytecode.pkl
SAME base/version
SAME base/byteorder
SAME base/.data/serialization_id
[2] scanner blindness
dangerous opcodes in the attack carrier bytecode.pkl: [] (none)
modelscan-allowlist flagged globals (whole .ptl): [] (none)
>>> picklescan/modelscan verdict => CLEAN (0 issues): True
[3] load each .ptl in a fresh child (no forward() ever called)
benign base.ptl -> rc= 0 LOADS-CLEAN
malicious OOB .ptl -> rc=3221225477 NATIVE-CRASH 0xC0000005 (ACCESS_VIOLATION)
control valid-index -> rc= 0 LOADS-CLEAN
control upgrader-off -> rc= 0 LOADS-CLEAN
PoC SUCCESS (scanner-clean + load-time native OOB crash): True
3221225477 == 0xC0000005 is the Windows ACCESS_VIOLATION code (the SIGSEGV equivalent). On Linux the child is killed by SIGSEGV (exit 139 / signal 11); reproduce.py classifies both.
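The exit-code handling described above can be sketched as follows. This is an illustration of the classification idea, not the attached reproduce.py; the function names are hypothetical, and the child command here is a harmless stand-in for a process that would call _load_for_lite_interpreter.

```python
import subprocess
import sys

ACCESS_VIOLATION = 0xC0000005  # Windows exit code for an access violation
SIGSEGV = 11


def classify(rc):
    """Map a child return code to a coarse verdict."""
    if rc == 0:
        return "LOADS-CLEAN"
    # Windows exit codes may be surfaced signed or unsigned.
    if rc & 0xFFFFFFFF == ACCESS_VIOLATION:
        return "NATIVE-CRASH"
    # POSIX: subprocess reports -11 for SIGSEGV; a shell would report 139.
    if rc in (-SIGSEGV, 128 + SIGSEGV):
        return "NATIVE-CRASH"
    return "OTHER-FAILURE"


def run_in_child(code):
    """Run a Python snippet in a fresh interpreter and return its exit code."""
    return subprocess.run([sys.executable, "-c", code]).returncode


# Stand-in child that exits normally.
print(classify(run_in_child("raise SystemExit(0)")))
```

Running the loader in a child matters because a native crash cannot be caught with try/except in the parent; only the exit status survives.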
Mechanism / controls (A/B/C): proves it is the upgrader pass, not parsing
The two controls in reproduce.py (and the standalone mechanism.py in the research dir) discriminate the exact trigger:
A version=3 + OP OOB -> NATIVE-CRASH 0xC0000005 (use_upgrader == TRUE, index out of range)
B version=10 + OP OOB -> LOADS-CLEAN (use_upgrader == FALSE: upgrader skipped)
C version=3 + OP idx0 -> LOADS-CLEAN (use_upgrader == TRUE, valid index)
- A vs B holds the OOB index constant and flips only the file version (which toggles use_upgrader). The crash disappears when the upgrader pass is skipped, so the crash is in applyUpgrader(), not in bytecode parsing or interpreter setup.
- A vs C holds version constant and flips only the operator index. A low version with a valid index loads cleanly: a low version alone is harmless; the out-of-range index is required.
Together these isolate the fault to the unchecked code.op_names_[inst.X] on parse_bytecode.cpp:75.
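The A/B/C results reduce to a two-factor condition, which can be written as a small predicate. This is a Python model of the behaviour described in this report, not PyTorch code; the function name is hypothetical and the constant mirrors the 0xA threshold quoted above.

```python
PRODUCED_FILE_FORMAT_VERSION = 0xA  # threshold quoted in this report


def upgrader_reads_out_of_bounds(file_version, op_index, op_table_size):
    """Model: the unchecked read happens only if the upgrader runs AND the
    OP index is outside the operator table."""
    use_upgrader = file_version < PRODUCED_FILE_FORMAT_VERSION
    in_range = 0 <= op_index < op_table_size
    return use_upgrader and not in_range


variants = {
    "A": (3, 0x7F000000, 1),   # low version, OOB index
    "B": (10, 0x7F000000, 1),  # upgrader skipped, OOB index
    "C": (3, 0, 1),            # low version, valid index
}
for name, args in variants.items():
    print(name, upgrader_reads_out_of_bounds(*args))
```

Only variant A satisfies both conditions, matching the observed crash pattern.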
Impact (realistic threat model)
Any application or service that loads an untrusted .ptl with the Lite Interpreter is exposed to an unauthenticated, pre-inference crash:
- on-device / mobile inference apps that accept user- or server-supplied lite models;
- model-hosting / conversion / validation backends that call _load_for_lite_interpreter on uploaded artifacts (e.g. to inspect or re-serve them);
- CI / model-zoo ingestion that opens third-party .ptl files.
The crash occurs at load time, before forward(), so "don't run untrusted inference" is not a mitigation: merely opening the file is sufficient. A single ~2 KB file reliably takes down the hosting process (DoS); a fleet that auto-loads submitted models can be crashed at will. Because the faulting access is a read of op_names_[inst.X].name, the same primitive is, in principle, an out-of-bounds heap read: with a smaller in-range-but-still-OOB index whose target page is mapped, the loader would build an "operator name" from adjacent heap bytes, which could be surfaced through error messages or upgrader behaviour (info-leak). This PoC demonstrates only the deterministic crash and does not weaponise the read into a confirmed leak. No write primitive and no code execution are claimed.
Honest duplicate / prior-art note
- This is a memory-safety bug in PyTorch's own mobile loader, in the same family as other unchecked-index issues in TorchScript/mobile deserialization that have been reported over time. I did not find a public advisory or fixed PR specifically for the applyUpgrader() operator-index read at load time, but PyTorch generally treats "loading an untrusted/malformed model can crash the process" as expected behaviour and documents that models should only be loaded from trusted sources. A triager may therefore close this as informative / by-design rather than a rewardable vulnerability. I am presenting it as a concrete, reproducible OOB read with a clean root cause and controls; the novelty claim is limited to this specific unchecked load-time path, not to "untrusted models are dangerous" in general.
- Even if accepted, severity should stay at the DoS / OOB-read level above unless the heap info-leak is actually demonstrated.
Scope note (important)
The vulnerability is in PyTorch (pytorch/pytorch), which is the correct huntr target. The picklescan / modelscan angle in the PoC is only a detection-gap observation: it shows the malicious .ptl sails past the standard pickle scanner (no GLOBAL/REDUCE/STACK_GLOBAL; the only global anywhere is the benign __torch__.M TorchScript module ref in data.pkl, which is not on any unsafe-global list). It is not a claim of a bug in modelscan/picklescan, and this report should not be filed against those tools. If a reviewer prefers to frame the scanner-blindness as a modelscan scope issue, that would be a separate, weaker report; the substantive finding here is the PyTorch load-time OOB read.
Remediation
Bounds-check the operator index in applyUpgrader() before dereferencing, mirroring the guard already present in the interpreter's OP/OPN handlers. Minimal fix in torch/csrc/jit/mobile/parse_bytecode.cpp:
if (inst.op == OpCode::OP) {
  TORCH_CHECK(
      inst.X >= 0 && static_cast<size_t>(inst.X) < code.op_names_.size(),
      "Malformed model: OP instruction operator index ", inst.X,
      " is out of range for operator table of size ", code.op_names_.size());
  std::string operator_name = code.op_names_[inst.X].name + ...
Defence in depth: validate every instruction's index/operand ranges against the corresponding table sizes (op_names_, operators_, constants_, types_) once at the end of parseInstructions() / during method finalisation, so all load-time consumers (upgrader and any future passes) are protected, not just the runtime interpreter. Consider also rejecting models whose declared file version is implausibly low for a non-empty operator set, and fuzzing the lite-interpreter deserializer (_load_for_lite_interpreter) on malformed bytecode tables.
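The single validation pass suggested above can be illustrated with a small model. This is a Python sketch of the idea only, not a patch for PyTorch: the Instruction class, the opcode-to-table mapping, and the function name are hypothetical, and a real implementation would live in the C++ deserializer and cover every opcode that carries a table index.

```python
from dataclasses import dataclass


@dataclass
class Instruction:
    op: str
    X: int
    N: int = 0


# Assumption: illustrative mapping from opcode to the table its X operand indexes.
INDEX_TABLE = {"OP": "operators", "OPN": "operators", "LOADC": "constants"}


def validate_instructions(instructions, table_sizes):
    """Reject any instruction whose index falls outside its table.

    Intended to run once after parsing, before any load-time pass
    (such as an upgrader) touches the instruction stream.
    """
    for pos, inst in enumerate(instructions):
        table = INDEX_TABLE.get(inst.op)
        if table is None:
            continue  # opcode carries no table index in this model
        size = table_sizes[table]
        if not 0 <= inst.X < size:
            raise ValueError(
                f"Malformed model: instruction {pos} ({inst.op}) index {inst.X} "
                f"is out of range for {table} table of size {size}")


sizes = {"operators": 1, "constants": 1}
validate_instructions([Instruction("LOADC", 0), Instruction("OP", 0, 2)], sizes)

try:
    validate_instructions([Instruction("OP", 0x7F000000, 2)], sizes)
except ValueError as exc:
    print(exc)
```

Centralising the check means later passes can assume well-formed indices, instead of each consumer re-implementing its own guard.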
Reproduction (exact)
# any environment with a recent torch (Lite Interpreter present), e.g. torch 2.12.0
python reproduce.py
# exit 0 and "PoC SUCCESS ... : True" == scanner-clean + load-time native OOB crash confirmed
Attached artifacts:
- reproduce.py: single-file reviewer script (builds models, asserts scanner-clean + native crash + controls; portable Linux /tmp marker, falls back to a temp dir on Windows).
- malicious.ptl: prebuilt crafted model (version=3, one operator, OP index 0x7F000000).
- base.ptl: benign baseline that loads cleanly through the identical path.
- variant_A_low_oob.ptl / variant_B_v10_oob.ptl / variant_C_low_valid.ptl: the A/B/C mechanism variants.
- upstream_parse_bytecode.cpp / upstream_interpreter.cpp: the upstream source showing the unchecked dereference (parse_bytecode.cpp:75) vs the guarded runtime path (interpreter.cpp:128-132).