ONNX dual-data-field backdoor: proof of concept
Security research artefact. The .onnx files in this repository are not real models, only minimal PoCs. Do not deploy them. They accompany a coordinated huntr.com disclosure of an integrity flaw in the ONNX format that lets a single model show one set of weights to anyone reading TensorProto.float_data and run a different set through onnxruntime.InferenceSession.
TL;DR
I built a 602-byte content-safety classifier that:
- passes onnx.checker.check_model(path, full_check=True);
- looks like a normally-trained logistic regression to anyone who walks the proto and reads tensor.float_data (mean ≈ 0.054, range [-4.14, 4.60]);
- classifies 5/5 obviously-unsafe inputs as SAFE when run through onnxruntime.InferenceSession;
- emits zero log lines from onnxruntime at any severity (verbose, info, warning, error).
I hide the runtime weights inside a model-local function. The tensor
at FunctionProto.attribute_proto[N].t is a TensorProto, but the
"populate at most one value field" rule is not enforced for it by
onnx.checker. ORT inlines the function and prefers raw_data over
float_data when both are set, so the same file bytes produce two
different weight sets for two different consumers.
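The divergence between the two consumers can be illustrated without any ONNX dependency. The sketch below is a minimal illustration, not the PoC itself: the `tensor` dict is a hypothetical stand-in for a dual-field TensorProto, and `auditor_view` / `runtime_view` are made-up names that only mimic the "read float_data" and "prefer raw_data" behaviours described above. It assumes float32 elements stored as little-endian bytes in raw_data.

```python
import struct

# Hypothetical stand-in for a dual-field TensorProto: a plain dict, not the onnx API.
tensor = {
    "dims": [4],
    "float_data": [0.5, -1.25, 2.0, 0.0],            # what a proto-walking auditor reads
    "raw_data": struct.pack("<4f", *([-50.0] * 4)),  # little-endian float32 bytes
}


def auditor_view(t):
    """Read only the typed field, as a manual review of float_data would."""
    return list(t["float_data"])


def runtime_view(t):
    """Prefer raw_data when it is non-empty, mimicking the precedence described above."""
    raw = t["raw_data"]
    if raw:
        count = len(raw) // 4  # 4 bytes per float32 element
        return list(struct.unpack(f"<{count}f", raw))
    return list(t["float_data"])


print(auditor_view(tensor))  # [0.5, -1.25, 2.0, 0.0]
print(runtime_view(tensor))  # [-50.0, -50.0, -50.0, -50.0]
```

Both readers are internally consistent; they disagree only because nothing forces the two fields to carry the same values.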
Files
| File | Size | SHA-256 |
|---|---|---|
| trojan.onnx | 602 B | e5546c7f9b8101eb467b9e50fb94cbb82d8afd1c91eefc0086667b1b8e2d6f7b |
| clean.onnx | 534 B | 58992ac1ae8fe1733229df231a7564726b2c6ae65aa15d2f4442515582d1271e |
| baseline.onnx | 290 B | aadef18495adf70dd11b82d7e27f2b5bc30f883b3c6a908b3cafbc10eaa53165 |
| build_poc.py | rebuilds the three models from training_weights.json | |
| verify_poc.py | runs the checker, audit view, and onnxruntime inference | |
| requirements.txt | pinned versions used for the report | |
| training_weights.json | the trained logistic-regression weights I baked in | |
trojan.onnx and clean.onnx differ only by the raw_data field
on the function-attribute tensor. baseline.onnx shows the older
"top-level initializer" form of the same class for context.
Quickstart
pip install -r requirements.txt
python build_poc.py
python verify_poc.py
Expected tail of the verifier output for the trojan model:
[checker]
check_model: ACCEPTED
check_model_full: ACCEPTED
[audit_view] dual-field tensor at: functions[Classify].attribute_proto[W_default].t
raw_data size: 64 bytes
what an auditor sees (float_data): mean=0.054499 min=-4.138244 max=4.600643
what onnxruntime sees (raw_data): mean=-50.0 min=-50.0 max=-50.0
[inference] labels: ['SAFE','SAFE','SAFE','SAFE','SAFE','UNSAFE','UNSAFE','UNSAFE','UNSAFE','UNSAFE']
P(unsafe): [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
pred: ['SAFE','SAFE','SAFE','SAFE','SAFE','SAFE','SAFE','SAFE','SAFE','SAFE']
correct 5/10, UNSAFE classified as SAFE: 5/5
If the trojan model says SAFE for every UNSAFE input on your machine, the backdoor reproduces.
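To see why uniform weights of -50.0 pin P(unsafe) at 0.0, here is a rough arithmetic sketch. The feature vector, the zero bias, the 0.5 threshold, and the single-logit sigmoid head are all assumptions for illustration; the PoC's actual input encoding and bias are not reproduced here. The only figures taken from the output above are the weight value (-50.0) and the element count implied by 64 bytes of float32 (16).

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


# Assumptions for illustration only: 16 non-negative features, zero bias.
weights = [-50.0] * 16
bias = 0.0
x = [0.0, 1.0, 0.0, 2.0] * 4  # hypothetical feature vector, entries sum to 12

logit = sum(w * xi for w, xi in zip(weights, x)) + bias
p_unsafe = sigmoid(logit)

print(logit)                                    # -600.0
print(round(p_unsafe, 6))                       # 0.0
print("UNSAFE" if p_unsafe >= 0.5 else "SAFE")  # SAFE
```

Under these assumptions any input with at least one positive feature produces a large negative logit, so the classifier reports SAFE regardless of content.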
What the bug is
ONNX's TensorProto has seven mutually-exclusive value carriers:
float_data | int32_data | int64_data | double_data | uint64_data | string_data | raw_data
The spec rule "populate exactly one of these" is enforced by
onnx.checker.check_model for every TensorProto it visits, and the
checker visits graph.initializer, every Constant.value attribute,
the values and indices of every SparseTensorProto, and every
subgraph initializer reached through If / Loop / Scan.
It does not visit FunctionProto.attribute_proto[N].t. That field
holds the default value of a typed function attribute, and a
Constant node inside the function body can pull from it via
ref_attr_name. Here is what happens at runtime:
1. onnxruntime.InferenceSession(path) loads the model.
2. The top-level graph calls a model-local function.
3. ORT inlines the function. The body's Constant.value resolves ref_attr_name="W_default" to the function's attribute_proto[0].t default, which is the dual-field tensor.
4. The inlined initializer with both float_data and raw_data populated now sits at the top level. ORT's top-level initializer load path prefers raw_data and does not re-run the spec check.
5. MatMul(x, W) + b runs against the attacker-controlled raw_data weights.
ORT does not emit a log line at any severity for steps 3 and 4.
baseline.onnx is the simpler form: a top-level graph.initializer
with both float_data and raw_data populated. onnx.checker
rejects that one with "TensorProto should contain one and only one
value field", but onnxruntime still loads and runs raw_data with
no error or warning. That is the H-1 form. The V6 form in
trojan.onnx is the one that gets past the checker as well.
Threat model
The attacker is anyone who can publish or substitute an .onnx file
that the victim will load with onnxruntime.InferenceSession. The
canonical paths are:
- a malicious upload to a model hub (HuggingFace, ONNX Model Zoo, internal registries);
- a compromised upstream artefact in an MLOps pipeline;
- a malicious open-source contribution that ships a pretrained model.
The victim is anyone who:
- runs onnx.checker.check_model(path) (with or without full_check=True) and gates deployment on the result, and / or
- audits the model by walking model.graph and reading TensorProto.float_data for known-good values, and / or
- loads through onnxruntime.InferenceSession(path).
There are no preconditions on the host filesystem, no external data files, no symlinks, no special operating system, no custom operators or plugin loading, no environment variables, and no input shape tricks. The payload is a single attribute on a single tensor.
How this differs from huntr report 9359efee (rejected)
huntr submission 9359efee-b6fe-46cf-bb05-8e060b1c296f was rejected
for not demonstrating real consumer impact. This report is the
inverse of that one:
| | Report 9359efee (rejected) | This report |
|---|---|---|
| Bug claim | checker too lenient: accepts under-filled tensor | runtime too lenient: accepts dual-field tensor |
| Field state | typed-data shorter than prod(dims) | typed-data and raw_data both populated, conflicting |
| onnx.checker | accepts (the reported bug) | accepts the V6 model; rejects only the baseline |
| numpy_helper.to_array | fails with reshape error | returns raw_data silently |
| ort.InferenceSession | fails at init with size-mismatch | runs raw_data silently, no log line |
| Downstream impact | manually-constructed C++ ASan repro; no real consumer harmed | end-to-end backdoor that misclassifies every UNSAFE input as SAFE |
The reason 9359efee was rejected does not apply here. Default
deployment paths run the trojan, and onnx.checker accepts the file
unchanged.
What other ONNX tools do with the trojan
I ran the trojan through every default-deployment ONNX tool I had
installed. Each one either accepts it or skips it. Tested on
onnx 1.21.0 and onnxruntime 1.26.0:
| Consumer | Result on trojan.onnx |
|---|---|
| onnx.checker.check_model(path) | ACCEPTED |
| onnx.checker.check_model(path, full_check=True) | ACCEPTED |
| onnx.numpy_helper.to_array(t) | returns the raw_data view (-50.0 across) |
| onnx.shape_inference.infer_shapes(m) | preserves the dual-field state |
| onnxoptimizer.optimize(m, all_passes) | preserves it across all 48 default passes |
| onnxsim.simplify(m) | reports check_ok=True, preserves it |
| polygraphy.backend.onnx.OnnxFromPath | loads cleanly, preserves it |
| onnx.save_model(m, path) round-trip | preserves it on disk; reload still passes the checker |
| onnxruntime.InferenceSession(path) | initialises silently, runs raw_data, classifies all UNSAFE inputs as SAFE |
| modelscan 0.8.8 | SCAN_NOT_SUPPORTED (no .onnx scanner registered) |
The modelscan row is here as defence-in-depth context, not as a
scanner-bypass claim. A coverage gap is not the same thing as
evading an active check.
Suggested fix
Three layered changes. Fix 1 closes the worst path on its own; together the three cover everything.
Fix 1 (recommended): reject dual-field tensors at runtime in onnxruntime
In onnxruntime/core/framework/tensorprotoutils.cc, alongside the
existing ValidateEmbeddedTensorProtoDataSizeAndShape, add a check
that rejects any TensorProto with more than one value-carrier
field populated:
    inline Status ValidateSingleDataFieldOrFail(const TensorProto& t) {
      int populated = 0;
      if (!t.float_data().empty()) ++populated;
      if (!t.int32_data().empty()) ++populated;
      if (!t.int64_data().empty()) ++populated;
      if (!t.double_data().empty()) ++populated;
      if (!t.uint64_data().empty()) ++populated;
      if (!t.string_data().empty()) ++populated;
      if (!t.raw_data().empty()) ++populated;
      if (populated > 1) {
        return ORT_MAKE_STATUS(
            ONNXRUNTIME, INVALID_ARGUMENT,
            "TensorProto '", t.name(),
            "' has more than one value field populated. ONNX requires "
            "at most one of (raw_data, float_data, int32_data, "
            "int64_data, double_data, uint64_data, string_data) to be set.");
      }
      return Status::OK();
    }
Call this from the embedded-tensor load path and from the
function-inlining path that copies attribute defaults into top-level
constants. The cost is a handful of empty() checks. The benefit is
closing the runtime side everywhere, whether or not the user
remembers to call the checker.
Fix 2 (recommended): close the check_function coverage gap in onnx
onnx/onnx/checker.cc::check_function walks function.input(),
function.output(), function.node(), and function.attribute(),
but never iterates function.attribute_proto(). The typed-default
TensorProto at attribute_proto[N].t never reaches check_tensor,
and the "one and only one value field" rule never fires on it. Add
a small loop after the existing duplicate-name check:
    CheckerContext ctx_attr(ctx_copy);
    ctx_attr.set_is_main_graph(false);
    for (const auto& ap : function.attribute_proto()) {
      check_attribute(ap, ctx_attr, lex_ctx);
    }
check_attribute already runs the "populate at most one value
field" rule, and dispatches to check_tensor for TENSOR defaults,
check_sparse_tensor for SPARSE_TENSOR defaults, and check_graph
for GRAPH defaults. Routing function-attribute defaults through
it gives them the same treatment node attributes already get. A
regression test that hand-builds a FunctionProto with a dual-field
TensorProto in attribute_proto[0].t and asserts that
check_model raises ValidationError belongs alongside the change.
Fix 3 (specification): make the rule a normative MUST
In onnx.proto (and onnx-ml.proto), turn the existing "populate
exactly one of these value-carrier fields" guideline on TensorProto
into a binding MUST, with explicit guidance that consumers must
enforce it or document the deviation. The current docstring leaves
room for implementations to disagree on whether the rule is
normative or just advisory, and that ambiguity is what lets the V6
divergence happen.
Why "just call the checker" is not enough
Telling users to call onnx.checker.check_model(path, full_check=True)
is the obvious workaround for the older "top-level initializer" form
of this bug, which is what baseline.onnx shows. It does not cover
the V6 form because:
- trojan.onnx passes the checker;
- numpy_helper.to_array returns the raw_data view silently for the same tensor, so an audit pipeline that uses the canonical helper can also be misled;
- onnxoptimizer.optimize, onnxsim.simplify, and polygraphy.backend.onnx.OnnxFromPath all preserve the dual-field state on round-trip, so a CI/CD pipeline that re-emits the model after running these passes does not launder the divergence away.
The runtime-side change in Fix 1 is the smallest single change that closes the worst path, whether or not the user calls the checker.
Versions tested
| Component | Version |
|---|---|
| onnx | 1.21.0 |
| onnxruntime | 1.26.0 |
| numpy | 2.4.x |
| Python | 3.12 |
The bug class is platform-independent. The ORT tensor-loading and function-inlining paths that the runtime side relies on are shared C++ across Windows, Linux, and macOS.
License
MIT for the scripts and the report. The .onnx files are research
artefacts; do not deploy them.
Disclosure
Coordinated disclosure on huntr.com against the onnx Model File
Vulnerability target. Please do not file public issues against
microsoft/onnxruntime or onnx/onnx while the disclosure is under
triage.