ExecuTorch `_load_for_executorch` silently downgrades `program_verification` to Minimal → flatbuffer Verifier bypass → controllable out-of-bounds heap read on `.pte` load

Title

ExecuTorch _load_for_executorch[_from_buffer] ignores the advertised program_verification=InternalConsistency default and loads with Minimal verification, bypassing the flatbuffer Verifier and enabling a controllable out-of-bounds heap read from a malicious .pte

Target

Huntr "Repository" / package dropdown: pytorch/executorch (PyPI: executorch)
Affected tool / version: executorch 1.3.1 (current released wheel; verified). Source is identical to pytorch/executorch main.
Affected entry points (public API): executorch.extension.pybindings.portable_lib._load_for_executorch and _load_for_executorch_from_buffer (and the C++ extension::Module load path they wrap).
Format: ExecuTorch program file, .pte (flatbuffer).

Severity

Honest rating: Medium / High-Medium — controllable out-of-bounds heap read + DoS on model load (NOT RCE).

This is a memory-safety bug reachable purely by loading an untrusted model; no method execution is required. The advertised, on-by-default safety control (InternalConsistency, which runs flatbuffers::Verifier) is silently not applied, so any structurally corrupt .pte that the Verifier is designed to reject is instead parsed with its attacker-controlled vector, string, and offset fields trusted.
Demonstrated impact: a controllable OOB heap read (information disclosure of adjacent heap memory, surfaced to the caller as tensor dimensions) and, with large length prefixes, a denial of service (the runtime walks far past the buffer → crash). This is the classic "inflated flatbuffer length prefix" primitive.
Dollar tier (honest): .pte is a flatbuffer model format in the pickle/.npy/.h5/.tflite/.pte family → ~$1.5k tier, not one of the named $4k formats (.joblib/.keras/.gguf/.safetensors/TF-SavedModel). The impact is OOB-read/DoS, not RCE, which keeps it below the file-write/RCE tier. I am explicitly not claiming write or code execution.

Summary

The Python stubs and the C++ PyModule constructors advertise program_verification = Verification.InternalConsistency as the default for _load_for_executorch[_from_buffer]. InternalConsistency is the only level that runs flatbuffers::Verifier + validate_program() over the untrusted .pte before any field is dereferenced. In reality the verification argument is dropped on the floor: the pybindings helper load_module_from_buffer()/load_module_from_file() constructs the Module without forwarding program_verification, and the program is then parsed lazily by Module::load() whose default is the much weaker Program::Verification::Minimal. Minimal only bounds-checks the root-table offset and trusts every vector length prefix, string length, and vtable offset in the flatbuffer. As a result, a malicious .pte whose sizes vector length prefix is inflated (claims 0x7FFFFFFF entries while the buffer holds 2) is accepted, and reading the tensor metadata walks out of bounds — a controllable heap over-read / DoS, exactly the corruption class the Verifier exists to stop.

Root cause (file:line)

All paths below are in executorch 1.3.1, identical to pytorch/executorch@main.

The advertised default is InternalConsistency — extension/pybindings/pybindings.pyi:
```
def _load_for_executorch(..., program_verification: Verification = Verification.InternalConsistency) -> ExecuTorchModule: ...
def _load_for_executorch_from_buffer(..., program_verification: Verification = Verification.InternalConsistency) -> ExecuTorchModule: ...
```
The C++ PyModule constructors and pybind defaults agree: pybindings.cpp lines 566/589/606/621/652/667 default program_verification = Program::Verification::InternalConsistency, and the bound functions register py::arg("program_verification") = ...InternalConsistency (pybindings.cpp:1544, 1554).

The verification argument is then dropped in extension/pybindings/pybindings.cpp, load_module_from_buffer() (lines 182–209) and load_module_from_file() (lines 211–234). Both declare a Program::Verification program_verification parameter (lines 188, 215) but construct the Module without passing it on:

```
inline std::unique_ptr<Module> load_module_from_buffer(
    const void* ptr, size_t ptr_len, ...,
    Program::Verification program_verification) {        // <-- received
  auto loader = loader_from_buffer(ptr, ptr_len);
  ...
  return std::make_unique<Module>(
      std::move(loader), nullptr, nullptr,
      std::move(event_tracer), nullptr);                 // <-- program_verification NOT forwarded
}
```

program_verification is unused after this point — there is no Module::load(verification) call wired to it.

Lazy parse uses the weak Minimal default — extension/module/module.h declares both Module::load overloads with const Program::Verification verification = Program::Verification::Minimal (module.h:188–190 and 212–215). Every lazy trigger calls load() with no argument: Module::method_names() → load() (module.cpp:320), Module::num_methods() → load() (module.cpp:315), Module::method_meta() → load() (module.cpp:554), Module::load_method() → load() (module.cpp:459). So Program::load(loader, Minimal) is what actually runs (module.cpp:301).
Minimal skips the Verifier — runtime/executor/program.cpp, Program::load() lines 176–224. The flatbuffers::Verifier + validate_program() run only under InternalConsistency (lines 177–202). Under Minimal the code merely range-checks the root-table offset (lines 204–224) and trusts the rest of the flatbuffer. Inflated vector length prefixes are therefore never caught.

Net effect: calling _load_for_executorch_from_buffer(bad, program_verification=Verification.InternalConsistency) — the documented-safe default — parses bad with Minimal verification. The safety control the API advertises is unreachable through the Module/pybindings path.
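The call chain above can be modelled in a few lines of Python. This is a toy illustration only, not ExecuTorch code: `ToyModule`, `toy_load_module_from_buffer`, and the local `Verification` enum are hypothetical stand-ins that mirror the roles of `Module`, `load_module_from_buffer()`, and `Program::Verification` as described in this report.

```python
from enum import Enum


class Verification(Enum):
    # Toy stand-in for Program::Verification; not the real executorch enum.
    Minimal = 0
    InternalConsistency = 1


class ToyModule:
    """Hypothetical stand-in for extension::Module with a lazy load()."""

    def __init__(self, data: bytes):
        self._data = data
        self.applied = None  # verification level actually used by load()

    def load(self, verification: Verification = Verification.Minimal) -> None:
        # Mirrors the weak default on Module::load described above.
        if self.applied is None:
            self.applied = verification

    def method_names(self) -> list:
        self.load()  # lazy trigger: no argument, so the default applies
        return ["forward"]


def toy_load_module_from_buffer(
    data: bytes,
    program_verification: Verification = Verification.InternalConsistency,
) -> ToyModule:
    # The parameter is accepted but never forwarded: this is the bug pattern.
    return ToyModule(data)


module = toy_load_module_from_buffer(b"...", Verification.InternalConsistency)
module.method_names()
print(module.applied)  # Verification.Minimal
```

The caller asked for `InternalConsistency`, yet the level recorded by the lazy `load()` is `Minimal`, because nothing between the helper and the first lazy trigger carries the requested value.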

Proof of concept

Build + assertion logic

Both a valid and a malicious .pte are built in-process with executorch's own exir serializer (the same serialization the official export pipeline uses), so the bytes are genuine ExecuTorch programs.

Baseline program: one method forward whose single output Tensor has a sizes flatbuffer vector of [1, 1] (constant_segment path, accepted by the released wheel).
Corruption: locate the sizes vector's 32-bit length prefix (byte pattern <u32 len=2><i32 1><i32 1>) and overwrite the length with a larger value. The buffer still physically holds only 2 elements.
Oracle: _load_program_from_buffer (its PyProgram ctor calls Program::load(loader, verification) eagerly with the requested mode) tells us, for the SAME bytes, whether the Verifier accepts them. We then show that _load_for_executorch_from_buffer(..., InternalConsistency) behaves like Minimal (the bypass) and read the corrupted metadata back through the public MethodMeta.output_tensor_meta(0).sizes().
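The corruption step can be sketched with the standard `struct` module. This is a stand-alone illustration on a synthetic byte string, not a real .pte and not the code in reproduce.py. The helper name `inflate_sizes_prefix` and the filler bytes are hypothetical, and the sketch assumes the little-endian scalar encoding that flatbuffers use.

```python
import struct

# Little-endian pattern for a 2-element int32 vector [1, 1]: <u32 len=2><i32 1><i32 1>.
PATTERN = struct.pack("<Iii", 2, 1, 1)


def inflate_sizes_prefix(blob: bytes, claimed_len: int) -> tuple:
    """Overwrite the first matching length prefix; return (patched bytes, offset)."""
    offset = blob.find(PATTERN)
    if offset < 0:
        raise ValueError("sizes pattern not found")
    patched = bytearray(blob)
    struct.pack_into("<I", patched, offset, claimed_len)
    return bytes(patched), offset


# Synthetic stand-in for a serialized program: filler, the vector, trailing bytes.
blob = b"\xAA" * 8 + PATTERN + b"\xBB" * 8
patched, offset = inflate_sizes_prefix(blob, 0x7FFFFFFF)

print(offset)                                     # 8
print(struct.unpack_from("<I", patched, offset))  # (2147483647,)
print(len(patched) == len(blob))                  # True: no elements were added
```

Only the four prefix bytes change. In a real program file the same 12-byte pattern could occur more than once, so a first-match search like this one would need to be checked against the intended vector.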

Captured output (executorch 1.3.1, CPython 3.12)

Running the portable reviewer script reproduce.py (self-contained; builds the .pte in-process):

[env] wrote+read benign marker at <tmp>/executorch_pte_poc_marker.txt
[env] executorch wheel: \executorch\exir\_serialize\_program.py

== (A) Verifier oracle (_load_program_from_buffer), SAME malicious bytes ==
   corrupted `sizes` length prefix @offset 276: 2 -> 2147483647 (buffer physically holds 2 elements)
   valid + InternalConsistency : ACCEPTED (num_methods=1)
   BAD   + InternalConsistency : REJECTED (Failed to load program, error: 0x:23)   <- flatbuffer Verifier catches it
   BAD   + Minimal             : ACCEPTED (num_methods=1)   <- no Verifier, accepted

== (B) THE BYPASS: _load_for_executorch_from_buffer (advertised default = InternalConsistency) ==
   BAD + InternalConsistency : ACCEPTED+PARSED (methods=['forward'])
   BAD + Minimal             : ACCEPTED+PARSED (methods=['forward'])
   (identical => requested InternalConsistency verification was NOT applied)

== (C) Concrete controllable OOB heap read (MethodMeta.output_tensor_meta.sizes) ==
   claimed dims= 3: sizes()=(1, 1, 786440)  (first 2 real; the rest are heap bytes past the vector)
   claimed dims= 5: sizes()=(1, 1, 786440, 262152, 8)  (first 2 real; the rest are heap bytes past the vector)
   claimed dims= 8: sizes()=(1, 1, 786440, 262152, 8, 8, 12, 0)  (first 2 real; the rest are heap bytes past the vector)
   claimed dims=16: sizes()=(1, 1, 786440, 262152, 8, 8, 12, 0, 0, 0, 0, 7, 2003988326, 6582881, 0, 0)  (first 2 real; the rest are heap bytes past the vector)

=================== VERDICT ===================
  Verifier detects corruption under InternalConsistency : YES
  Minimal accepts the same corruption                   : YES
  _load_for_executorch(InternalConsistency) bypasses it : YES
  Out-of-bounds tensor dims surfaced to caller          : YES
  >>> program_verification BYPASS + OOB READ CONFIRMED  : True

Interpretation:

(A) proves the Verifier genuinely catches this corruption: identical malicious bytes are REJECTED under InternalConsistency (error 0x23 = InvalidProgram, printed as "0x:23" in the captured output and raised at program.cpp:187 "Verification failed; data may be truncated or corrupt") but ACCEPTED under Minimal.
(B) is the vulnerability: _load_for_executorch_from_buffer with the advertised InternalConsistency default produces the same result as Minimal — the requested verification was not applied.
(C) is the concrete impact: MethodMeta.output_tensor_meta(0).sizes() returns attacker-controlled dimensions read past the end of the real 2-element flatbuffer vector. Only the first two values (1, 1) are real; the trailing values (786440, 262152, …) are adjacent heap bytes. Increasing the claimed length walks further out of bounds — an information leak that, at large lengths, becomes an out-of-range access / crash (DoS). The leaked dwords are reproducible across runs, confirming a genuine over-read rather than randomness.
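One way to sanity-check that the trailing values are neighbouring bytes rather than noise is to re-pack the reported dimensions and look at the raw bytes. The sketch below assumes each dimension is a 32-bit little-endian integer, consistent with the `<i32 1>` pattern described earlier. It uses only the numbers from the captured output and does not call into executorch.

```python
import struct

# sizes() reported for claimed dims=16 in the captured output above.
leaked = (1, 1, 786440, 262152, 8, 8, 12, 0, 0, 0, 0, 7, 2003988326, 6582881, 0, 0)

# Re-pack as little-endian int32 to recover the raw bytes that were read.
raw = struct.pack("<16i", *leaked)

print(raw[:8].hex())  # 0100000001000000 -> the two real elements
print(raw[44:56])     # b'\x07\x00\x00\x00forward\x00'
```

Elements 11 to 13 decode as a 4-byte length of 7 followed by the NUL-terminated string `forward`, which matches the method name. This suggests that, at this claimed length, the over-read is still walking serialized program data held in the heap buffer. That reading is inferred from the byte values alone, not from the runtime's source.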

reproduce.py exits 0 only when all four verdict conditions hold.

Files in this package

reproduce.py — single self-contained reviewer script (builds the .pte in-process, runs A/B/C, prints the verdict, exits non-zero on failure). Portable: benign marker written/read under the OS temp dir.
poc_final.py + build_lib.py — the original three-part PoC (same logic; build_lib.py wires the in-wheel flatc and exir serializer).
crash.pte, marker.pte — concrete pre-built malicious sample programs (inflated/poisoned sizes vector) for quick manual loading.

Impact (realistic threat model)

ExecuTorch is the on-device PyTorch runtime; .pte files are distributed to and loaded by mobile/edge apps and by any server-side code that ingests user- or third-party-supplied models. _load_for_executorch[_from_buffer] is the canonical, documented loading API, and InternalConsistency is advertised as its default precisely so that loading an untrusted .pte is safe against malformed/corrupt files. Because that verification is silently downgraded to Minimal:

An attacker who can get a victim to load a crafted .pte (malicious model on a hub, supply-chain swap, MITM of a model download, a model received from another user) triggers, at load time and with no method execution, a controllable out-of-bounds heap read. Concretely this yields:
- Information disclosure: adjacent heap memory is surfaced through tensor metadata (sizes()/nbytes()), and downstream allocation/copy logic sized from those attacker-controlled dims can read/move further out-of-bounds.
- Denial of service: large/odd length prefixes and poisoned offsets make the runtime dereference far outside the buffer → crash of the host process.
The bug defeats the specific mitigation the API documents. A developer who reads the stub/signature and relies on the default InternalConsistency to harden untrusted-model loading gets no protection at all through this path. That gap between documented and actual behavior is the core of the report.

I deliberately did not develop this into write/RCE; flatbuffer over-reads of this kind are routinely accepted as memory-corruption findings on their own, and over-claiming would be dishonest.

Honest duplicate + scope note

Scope: the vulnerable code is in executorch itself (pytorch/executorch) — the extension/pybindings + extension/module + runtime/executor load path. It is not a finding in a scanner. (For the threat model I checked modelaudit's ExecuTorchScanner: it performs only static checks on .pte — magic/signature validation, zip path-traversal, embedded .pkl/.py detection — and never invokes the executorch loader, so it neither triggers nor mitigates this bug. No scanner-scope ambiguity.) Submit against the pytorch/executorch package.
Duplicate check: I am not aware of a public advisory/CVE/huntr report for "_load_for_executorch drops program_verification / loads with Minimal by default." The behavior is reproduced here against the released 1.3.1 wheel from first principles. The maintainers should still be allowed to dedup against any internal tracking. The two facts that make this novel and not "working as intended": (1) the public stub/pyi and the PyModule constructors explicitly advertise InternalConsistency as the default, and (2) the same bytes are provably rejected when that verification actually runs (oracle path A) — so this is a real control that is silently unreachable via the Module path, not a documented limitation.

Remediation

Forward the argument. In load_module_from_buffer()/load_module_from_file()/load_module_from_buffer_with_data_file() (pybindings.cpp), thread program_verification into the load: after constructing the Module, call module->load(program_verification) (and load_method/method_names thereafter), or add a Module constructor/factory that stores the requested verification and uses it for the lazy load(). The value is currently accepted and discarded.
Make the safe level the real default. Change the default of Module::load(...) (module.h:188–190, 212–215) from Program::Verification::Minimal to Program::Verification::InternalConsistency, so the lazy parse triggered by method_names()/method_meta()/load_method() verifies by default. Callers that truly want to skip verification can opt in to Minimal explicitly.
Defense in depth. Even under Minimal, bound vector/string length prefixes against the remaining buffer before iterating tensor metadata (e.g. in MethodMeta/TensorInfo accessors), so a corrupt length cannot translate into an over-read regardless of verification level.
Docs. Until fixed, the .pyi/docstrings should not advertise InternalConsistency as the effective default for _load_for_executorch[_from_buffer], since the Module path does not honor it.
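The defense-in-depth item amounts to checking a claimed element count against the bytes that remain in the buffer before iterating. A minimal sketch of that check, assuming a little-endian u32 length prefix followed by fixed-size elements, is given here in Python for brevity. The function name `vector_fits` is hypothetical, and a real fix would live in the C++ accessors.

```python
import struct


def vector_fits(buf: bytes, offset: int, elem_size: int = 4) -> bool:
    """Return True if the length-prefixed vector at `offset` lies inside `buf`."""
    if offset < 0 or offset + 4 > len(buf):
        return False  # the length prefix itself is out of range
    (count,) = struct.unpack_from("<I", buf, offset)
    # Compare against the remaining bytes; Python ints cannot overflow here,
    # but a C++ version would need an overflow-safe form of this check.
    return count <= (len(buf) - offset - 4) // elem_size


good = struct.pack("<Iii", 2, 1, 1)
bad = struct.pack("<Iii", 0x7FFFFFFF, 1, 1)

print(vector_fits(good, 0))  # True
print(vector_fits(bad, 0))   # False
```

Dividing the remaining byte count by the element size, instead of multiplying the claimed count, keeps the comparison safe from integer overflow when it is ported to fixed-width arithmetic.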
