ONNX dual-data-field backdoor: proof of concept

Security research artefact. The .onnx files in this repository are not real models, only minimal PoCs. Do not deploy them. They accompany a coordinated huntr.com disclosure of an integrity flaw in the ONNX format that lets a single model show one set of weights to anyone reading TensorProto.float_data and run a different set through onnxruntime.InferenceSession.

TL;DR

I built a 602-byte content-safety classifier that:

  • passes onnx.checker.check_model(path, full_check=True);
  • looks like a normally-trained logistic regression to anyone who walks the proto and reads tensor.float_data (mean ≈ 0.054, range [-4.14, 4.60]);
  • classifies 5/5 obviously-unsafe inputs as SAFE when run through onnxruntime.InferenceSession;
  • emits zero log lines from onnxruntime at any severity (verbose, info, warning, error).

I hide the runtime weights inside a model-local function. The tensor at FunctionProto.attribute_proto[N].t is a TensorProto, but the "populate at most one value field" rule is not enforced for it by onnx.checker. ORT inlines the function and prefers raw_data over float_data when both are set, so the same file bytes produce two different weight sets for two different consumers.

Files

| File | Size | SHA-256 |
|---|---|---|
| trojan.onnx | 602 B | e5546c7f9b8101eb467b9e50fb94cbb82d8afd1c91eefc0086667b1b8e2d6f7b |
| clean.onnx | 534 B | 58992ac1ae8fe1733229df231a7564726b2c6ae65aa15d2f4442515582d1271e |
| baseline.onnx | 290 B | aadef18495adf70dd11b82d7e27f2b5bc30f883b3c6a908b3cafbc10eaa53165 |

| File | Purpose |
|---|---|
| build_poc.py | rebuilds the three models from training_weights.json |
| verify_poc.py | runs the checker, audit view, and onnxruntime inference |
| requirements.txt | pinned versions used for the report |
| training_weights.json | the trained logistic-regression weights I baked in |

trojan.onnx and clean.onnx differ only by the raw_data field on the function-attribute tensor. baseline.onnx shows the older "top-level initializer" form of the same class for context.
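Before running anything, it may be worth confirming that the downloaded files match the digests listed above. The following is a minimal sketch using only the standard library. The helper names `sha256_of` and `verify_files` are illustrative and are not part of build_poc.py or verify_poc.py, and the sketch assumes the three .onnx files sit in the current directory.

```python
import hashlib
from pathlib import Path

# Digests copied from the Files table in this README.
EXPECTED = {
    "trojan.onnx": "e5546c7f9b8101eb467b9e50fb94cbb82d8afd1c91eefc0086667b1b8e2d6f7b",
    "clean.onnx": "58992ac1ae8fe1733229df231a7564726b2c6ae65aa15d2f4442515582d1271e",
    "baseline.onnx": "aadef18495adf70dd11b82d7e27f2b5bc30f883b3c6a908b3cafbc10eaa53165",
}


def sha256_of(path):
    """Return the hex SHA-256 digest of the file at `path`."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def verify_files(directory=".", expected=EXPECTED):
    """Map each file name to True (match), False (mismatch), or None (missing)."""
    results = {}
    for name, digest in expected.items():
        path = Path(directory) / name
        results[name] = (sha256_of(path) == digest) if path.is_file() else None
    return results


if __name__ == "__main__":
    labels = {True: "OK", False: "MISMATCH", None: "missing"}
    for name, outcome in verify_files().items():
        print(f"{name}: {labels[outcome]}")
```

A mismatch only says that the bytes differ from the ones hashed for this README; it does not say why.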

Quickstart

pip install -r requirements.txt
python build_poc.py
python verify_poc.py

Expected tail of the verifier output for the trojan model:

[checker]
  check_model: ACCEPTED
  check_model_full: ACCEPTED

[audit_view]  dual-field tensor at: functions[Classify].attribute_proto[W_default].t
  raw_data size: 64 bytes
  what an auditor sees (float_data):  mean=0.054499  min=-4.138244  max=4.600643
  what onnxruntime sees (raw_data):  mean=-50.0  min=-50.0  max=-50.0

[inference] labels:    ['SAFE','SAFE','SAFE','SAFE','SAFE','UNSAFE','UNSAFE','UNSAFE','UNSAFE','UNSAFE']
            P(unsafe): [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
            pred:      ['SAFE','SAFE','SAFE','SAFE','SAFE','SAFE','SAFE','SAFE','SAFE','SAFE']
            correct 5/10, UNSAFE classified as SAFE: 5/5

If the trojan model says SAFE for every UNSAFE input on your machine, the backdoor reproduces.
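The all-zero P(unsafe) row follows from the arithmetic of a logistic regression whose weights are all -50.0. The sketch below is an illustration under stated assumptions, not the PoC's own code: it assumes 16 weights (64 bytes of raw_data at 4 bytes per float32), non-negative feature values, and a zero bias. The feature vector and the "audited" weights are invented for the example.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def p_unsafe(features, weights, bias=0.0):
    """Logistic-regression score: sigmoid(x . W + b)."""
    logit = sum(x * w for x, w in zip(features, weights)) + bias
    return sigmoid(logit)


# Hypothetical input: two active, non-negative features out of 16.
features = [0.0] * 16
features[3] = 1.0
features[7] = 2.0

# What the runtime executes according to the verifier output: -50.0 everywhere.
runtime_weights = [-50.0] * 16
# Invented stand-in for the weights an auditor would read from float_data.
audited_weights = [0.5] * 16

p_runtime = p_unsafe(features, runtime_weights)   # logit = -150.0
p_audited = p_unsafe(features, audited_weights)   # logit = +1.5

print(round(p_runtime, 6))  # 0.0
print(p_audited > 0.5)      # True
```

Under these assumptions, any input with at least one active feature gets a large negative logit, so the score rounds to 0.0 regardless of what the input says.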

What the bug is

ONNX's TensorProto has seven mutually-exclusive value carriers:

float_data | int32_data | int64_data | double_data | uint64_data | string_data | raw_data

The spec rule "populate exactly one of these" is enforced by onnx.checker.check_model for every TensorProto it visits, and the checker visits graph.initializer, every Constant.value attribute, the values and indices of every SparseTensorProto, and every subgraph initializer reached through If / Loop / Scan.

It does not visit FunctionProto.attribute_proto[N].t. That field holds the default value of a typed function attribute, and a Constant node inside the function body can pull from it via ref_attr_name. Here is what happens at runtime:

  1. onnxruntime.InferenceSession(path) loads the model.
  2. The top-level graph calls a model-local function.
  3. ORT inlines the function. The body's Constant.value resolves ref_attr_name="W_default" to the function's attribute_proto[0].t default, which is the dual-field tensor.
  4. The inlined initializer with both float_data and raw_data populated now sits at the top level. ORT's top-level initializer load path prefers raw_data and does not re-run the spec check.
  5. MatMul(x, W) + b runs against the attacker-controlled raw_data weights.

ORT does not emit a log line at any severity for steps 3 and 4.
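The divergence in steps 3 to 5 comes down to one tensor carrying two encodings of its values. The sketch below models that with a plain dataclass instead of a real TensorProto, so it runs without onnx installed. `FakeTensor`, `auditor_view`, and `runtime_view` are illustrative names, the float_data values are invented, and the raw_data-first preference mirrors the behaviour described above; it does not reproduce ORT's code.

```python
import struct
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class FakeTensor:
    """Stand-in for a float32 TensorProto with two of its value fields."""
    float_data: list = field(default_factory=list)
    raw_data: bytes = b""


def auditor_view(tensor):
    """What a reviewer sees when reading only the typed field."""
    return list(tensor.float_data)


def runtime_view(tensor):
    """What a raw_data-first consumer sees: raw bytes win when present."""
    if tensor.raw_data:
        count = len(tensor.raw_data) // 4  # 4 bytes per float32
        # ONNX stores raw_data in little-endian order.
        return list(struct.unpack(f"<{count}f", tensor.raw_data))
    return list(tensor.float_data)


benign = [0.25, -1.5, 3.0, 0.5] * 4                    # invented values
payload = struct.pack("<16f", *([-50.0] * 16))         # attacker weights
tensor = FakeTensor(float_data=benign, raw_data=payload)

print(len(tensor.raw_data))                  # 64
print(mean(auditor_view(tensor)))            # mean of the benign-looking values
print(mean(runtime_view(tensor)))            # -50.0
```

Both views come from the same object, which is why a file-level hash or a checker that accepts the file cannot distinguish them.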

baseline.onnx is the simpler form: a top-level graph.initializer with both float_data and raw_data populated. onnx.checker rejects that one with "TensorProto should contain one and only one value field", but onnxruntime still loads it and runs raw_data with no error or warning. This report calls that the H-1 form. The V6 form, in trojan.onnx, is the one that gets past the checker as well.

Threat model

The attacker is anyone who can publish or substitute an .onnx file that the victim will load with onnxruntime.InferenceSession. The canonical paths are:

  • a malicious upload to a model hub (HuggingFace, ONNX Model Zoo, internal registries);
  • a compromised upstream artefact in an MLOps pipeline;
  • a malicious open-source contribution that ships a pretrained model.

The victim is anyone who:

  • runs onnx.checker.check_model(path) (with or without full_check=True) and gates deployment on the result, and / or
  • audits the model by walking model.graph and reading TensorProto.float_data for known-good values, and / or
  • loads through onnxruntime.InferenceSession(path).

There are no preconditions on the host filesystem, no external data files, no symlinks, no special operating system, no custom operators or plugin loading, no environment variables, and no input shape tricks. The payload is a single attribute on a single tensor.

How this differs from huntr report 9359efee (rejected)

huntr submission 9359efee-b6fe-46cf-bb05-8e060b1c296f was rejected for not demonstrating real consumer impact. This report is the inverse of that one:

| | Report 9359efee (rejected) | This report |
|---|---|---|
| Bug claim | checker too lenient: accepts under-filled tensor | runtime too lenient: accepts dual-field tensor |
| Field state | typed-data shorter than prod(dims) | typed-data and raw_data both populated, conflicting |
| onnx.checker | accepts (the reported bug) | accepts the V6 model; rejects only the baseline |
| numpy_helper.to_array | fails with reshape error | returns raw_data silently |
| ort.InferenceSession | fails at init with size-mismatch | runs raw_data silently, no log line |
| Downstream impact | manually-constructed C++ ASan repro; no real consumer harmed | end-to-end backdoor that misclassifies every UNSAFE input as SAFE |

The reason 9359efee was rejected does not apply here. Default deployment paths run the trojan, and onnx.checker accepts the file unchanged.

What other ONNX tools do with the trojan

I ran the trojan through every default-deployment ONNX tool I had installed. Each one either accepts it or skips it. Tested on onnx 1.21.0 and onnxruntime 1.26.0:

| Consumer | Result on trojan.onnx |
|---|---|
| onnx.checker.check_model(path) | ACCEPTED |
| onnx.checker.check_model(path, full_check=True) | ACCEPTED |
| onnx.numpy_helper.to_array(t) | returns the raw_data view (-50.0 in every element) |
| onnx.shape_inference.infer_shapes(m) | preserves the dual-field state |
| onnxoptimizer.optimize(m, all_passes) | preserves it across all 48 default passes |
| onnxsim.simplify(m) | reports check_ok=True, preserves it |
| polygraphy.backend.onnx.OnnxFromPath | loads cleanly, preserves it |
| onnx.save_model(m, path) round-trip | preserves it on disk; reload still passes the checker |
| onnxruntime.InferenceSession(path) | initialises silently, runs raw_data, classifies all UNSAFE inputs as SAFE |
| modelscan 0.8.8 | SCAN_NOT_SUPPORTED (no .onnx scanner registered) |

The modelscan row is here as defence-in-depth context, not as a scanner-bypass claim. A coverage gap is not the same thing as evading an active check.

Suggested fix

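Three layered changes. Fix 1 on its own closes the worst path, the silent execution of raw_data at runtime. Fix 2 closes the checker coverage gap, and Fix 3 removes the ambiguity in the specification. Together they cover every path described in this report.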

Fix 1 (recommended): reject dual-field tensors at runtime in onnxruntime

In onnxruntime/core/framework/tensorprotoutils.cc, alongside the existing ValidateEmbeddedTensorProtoDataSizeAndShape, add a check that rejects any TensorProto with more than one value-carrier field populated:

// Rejects a TensorProto that populates more than one value-carrier field.
inline Status ValidateSingleDataFieldOrFail(const TensorProto& t) {
  // Count every value-carrier field that holds at least one element or byte.
  int populated = 0;
  if (!t.float_data().empty())  ++populated;
  if (!t.int32_data().empty())  ++populated;
  if (!t.int64_data().empty())  ++populated;
  if (!t.double_data().empty()) ++populated;
  if (!t.uint64_data().empty()) ++populated;
  if (!t.string_data().empty()) ++populated;
  if (!t.raw_data().empty())    ++populated;
  // Zero or one populated field is fine; two or more is ambiguous, so fail.
  if (populated > 1) {
    return ORT_MAKE_STATUS(
      ONNXRUNTIME, INVALID_ARGUMENT,
      "TensorProto '", t.name(),
      "' has more than one value field populated. ONNX requires "
      "at most one of (raw_data, float_data, int32_data, "
      "int64_data, double_data, uint64_data, string_data) to be set.");
  }
  return Status::OK();
}

Call this from the embedded-tensor load path and from the function-inlining path that copies attribute defaults into top-level constants. The cost is a handful of empty() checks. The benefit is closing the runtime side everywhere, whether or not the user remembers to call the checker.

Fix 2 (recommended): close the check_function coverage gap in onnx

onnx/onnx/checker.cc::check_function walks function.input(), function.output(), function.node(), and function.attribute(), but never iterates function.attribute_proto(). The typed-default TensorProto at attribute_proto[N].t never reaches check_tensor, and the "one and only one value field" rule never fires on it. Add a small loop after the existing duplicate-name check:

// Validate each typed attribute default the same way node attributes are validated.
CheckerContext ctx_attr(ctx_copy);
ctx_attr.set_is_main_graph(false);
for (const auto& ap : function.attribute_proto()) {
  check_attribute(ap, ctx_attr, lex_ctx);
}

check_attribute already runs the "populate at most one value field" rule, and dispatches to check_tensor for TENSOR defaults, check_sparse_tensor for SPARSE_TENSOR defaults, and check_graph for GRAPH defaults. Routing function-attribute defaults through it gives them the same treatment node attributes already get. A regression test that hand-builds a FunctionProto with a dual-field TensorProto in attribute_proto[0].t and asserts that check_model raises ValidationError belongs alongside the change.
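The coverage this loop adds can also be approximated from Python. The sketch below is a rough illustration, not the checker's implementation. It relies only on attribute access and len(), so it should work on a ModelProto returned by onnx.load, but the demo uses SimpleNamespace stand-ins so that it runs without onnx installed. The function names are illustrative, and the scan covers only function attribute defaults (the t and tensors fields), not every place a TensorProto can live.

```python
from types import SimpleNamespace

# The seven value-carrier fields of TensorProto named in this report.
VALUE_FIELDS = (
    "float_data", "int32_data", "int64_data", "double_data",
    "uint64_data", "string_data", "raw_data",
)


def populated_value_fields(tensor):
    """Names of the value-carrier fields holding at least one element or byte."""
    return [name for name in VALUE_FIELDS if len(getattr(tensor, name, ())) > 0]


def find_dual_field_defaults(model):
    """List (location, fields) for function attribute defaults with >1 value field."""
    findings = []
    for fn in model.functions:
        for ap in fn.attribute_proto:
            candidates = [("t", ap.t)]
            candidates += [
                (f"tensors[{i}]", t) for i, t in enumerate(getattr(ap, "tensors", ()))
            ]
            for suffix, tensor in candidates:
                fields = populated_value_fields(tensor)
                if len(fields) > 1:
                    location = f"functions[{fn.name}].attribute_proto[{ap.name}].{suffix}"
                    findings.append((location, fields))
    return findings


# Stand-in model shaped like the trojan: one function, one dual-field default.
dual = SimpleNamespace(float_data=[0.1, 0.2], raw_data=b"\x00\x00\x48\xc2")
attr = SimpleNamespace(name="W_default", t=dual, tensors=[])
func = SimpleNamespace(name="Classify", attribute_proto=[attr])
model = SimpleNamespace(functions=[func])

print(find_dual_field_defaults(model))
# [('functions[Classify].attribute_proto[W_default].t', ['float_data', 'raw_data'])]
```

With onnx installed, passing the result of onnx.load("trojan.onnx") instead of the stand-in would be the natural way to try it against the PoC file.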

Fix 3 (specification): make the rule a normative MUST

In onnx.proto (and onnx-ml.proto), turn the existing "populate exactly one of these value-carrier fields" guideline on TensorProto into a binding MUST, with explicit guidance that consumers must enforce it or document the deviation. The current docstring leaves room for implementations to disagree on whether the rule is normative or just advisory, and that ambiguity is what lets the V6 divergence happen.

Why "just call the checker" is not enough

Telling users to call onnx.checker.check_model(path, full_check=True) is the obvious workaround for the older "top-level initializer" form of this bug, which is what baseline.onnx shows. It does not cover the V6 form because:

  • trojan.onnx passes the checker;
  • numpy_helper.to_array returns the raw_data view silently for the same tensor, so an audit pipeline that uses the canonical helper can also be misled;
  • onnxoptimizer.optimize, onnxsim.simplify, and polygraphy.backend.onnx.OnnxFromPath all preserve the dual-field state on round-trip, so a CI/CD pipeline that re-emits the model after running these passes does not launder the divergence away.

The runtime-side change in Fix 1 is the smallest single change that closes the worst path, whether or not the user calls the checker.

Versions tested

| Component | Version |
|---|---|
| onnx | 1.21.0 |
| onnxruntime | 1.26.0 |
| numpy | 2.4.x |
| Python | 3.12 |

The bug class is platform-independent. The ORT tensor-loading and function-inlining paths that the runtime side relies on are shared C++ across Windows, Linux, and macOS.

License

MIT for the scripts and the report. The .onnx files are research artefacts; do not deploy them.

Disclosure

Coordinated disclosure on huntr.com against the onnx Model File Vulnerability target. Please do not file public issues against microsoft/onnxruntime or onnx/onnx while the disclosure is under triage.
