F-5: Arbitrary file read in Python tensorflowjs.read_weights via attacker-controlled manifest paths[], leading to CI secret exfiltration through converter output .h5
Authorized security research artifact disclosed via huntr.com's
TensorFlow.js Model Format Vulnerability program.
Source commit 7f5309fef0a47545e34049903dbdae0f97285f7e. All capture data was
collected against a synthetic /tmp/victim_host/ CI-runner lab; no real PII is present.
Real impact captured (sanitized)
4 / 4 CI secrets recovered via Python read_weights(), then exfiltrated into the converter output .h5:
- Files recovered: .npmrc (91 B), .docker/config.json (264 B), ~/.ssh/id_rsa (269 B), .env (274 B)
- Both attack forms work: (a) absolute-path injection (os.path.join discards the base), (b) .. segments (no canonicalisation)
- Recovered bytes land inside the converter's output .h5 weight dataset, which leaks when CI publishes the artifact
All proof data above was captured against a synthetic CI-runner lab at /tmp/victim_host/ (no real PII present). Full capture: F5_REAL_IMPACT_PROOF_2026-06-11.txt.
Summary
A CI/CD pipeline running tensorflowjs_converter --input_format=tfjs_layers_model
on an attacker-supplied PR or commit (a common pattern for any project that
auto-converts tfjs model artefacts as part of CI) will leak its runner
secrets into the converter's output .h5, including $GITHUB_TOKEN,
$NPM_TOKEN, ~/.ssh/id_rsa, ~/.docker/config.json, and ~/.npmrc. The
root cause is in
tfjs-converter/python/tensorflowjs/read_weights.py L65-L74,
which calls open(os.path.join(base_path, attacker_path), 'rb') with no
containment check. os.path.join silently discards base_path when
attacker_path is absolute, and does not normalise .. segments. Both
traversal forms succeed; the recovered bytes flow into data_buffers,
become weight tensor data in read_weights()'s return value, and end up as
tensor weights in the converter's .h5 output, which CI pipelines typically
upload as build artefacts or push to registries.
Root Cause
Lines of Code:
- tfjs-converter/python/tensorflowjs/read_weights.py L35 (read_weights signature)
- tfjs-converter/python/tensorflowjs/read_weights.py L65-L74 (vulnerable loop)
- Callers: tfjs-converter/python/tensorflowjs/converters/keras_tfjs_loader.py L292 and L359, reached from the CLI mode --input_format=tfjs_layers_model --output_format=keras*.
In read_weights.py:65-74:
data_buffers = []
for group in weights_manifest:
    buff = io.BytesIO()
    buff_writer = io.BufferedWriter(buff)
    for path in group['paths']:  # <- attacker-controlled JSON
        with open(os.path.join(base_path, path), 'rb') as f:  # <- L70: no containment check
            buff_writer.write(f.read())
    buff_writer.flush()
    buff_writer.seek(0)
    data_buffers.append(buff.read())
return decode_weights(weights_manifest, data_buffers, flatten=flatten)
os.path.join has two documented behaviours that matter here:
- If path is absolute, base_path is silently discarded: os.path.join('/safe', '/etc/passwd') returns '/etc/passwd'.
- If path contains .., nothing normalises or rejects the traversal: os.path.join('/safe', '../../etc/passwd') returns '/safe/../../etc/passwd', which open happily follows out of /safe.
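The two behaviours can be checked without touching the filesystem. The sketch below is illustrative only: it uses posixpath (the POSIX flavour of os.path) so the result is the same on any host, and the helper name resolve_like_read_weights is hypothetical, not part of tensorflowjs. normpath is a purely lexical stand-in for where open() would land; it does not account for symlinks.

```python
import posixpath


def resolve_like_read_weights(base_path: str, manifest_path: str) -> str:
    """Join the way L70 does, then normalise lexically to show the target.

    Hypothetical helper for illustration; not part of tensorflowjs.
    """
    joined = posixpath.join(base_path, manifest_path)
    return posixpath.normpath(joined)


if __name__ == '__main__':
    for candidate in ('group1-shard1of1.bin',
                      '/etc/passwd',
                      '../../etc/passwd'):
        print(repr(candidate), '->',
              resolve_like_read_weights('/safe', candidate))
```

Only the first candidate stays under /safe; the other two resolve to /etc/passwd, which is why a containment check has to run on the resolved path rather than on the joined string.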
The bytes from the leaked file flow into data_buffers, are passed to
decode_weights, and are reshaped into a tensor whose underlying memory is
the secret content. The caller (keras_tfjs_loader) then writes this tensor
into the output .h5 model as a "weight".
Why this is NOT a duplicate of upstream issue #8628: #8628
(tensorflow/tfjs, Feb 2026, still open) reports a path-traversal on the
write side in write_weights.py. F-5 is on the read side in
read_weights.py. Same root-cause class (os.path.join with attacker
string and no canonicalisation), but a different file, different function,
different impact (write-side enables overwriting attacker-chosen paths;
read-side enables exfiltrating arbitrary bytes). A fix to #8628 will not
silently fix F-5.
Internal Pre-conditions
- The victim pipeline runs tensorflowjs_converter --input_format=tfjs_layers_model (or tensorflowjs.converters.keras_tfjs_loader.deserialize_tfjs_layers_model programmatically) on the attacker's model.json + manifest.
- The pipeline outputs an artefact (.h5, .tar, release attachment) that the attacker can subsequently read.
External Pre-conditions
None.
Attack Path
- Attacker opens a PR (or pushes to a tracked branch) that adds a model.json whose weightsManifest[0].paths[0] is an absolute path (e.g. /home/runner/.docker/config.json) or a .. traversal (e.g. ../../../home/runner/.npmrc).
- CI runs tensorflowjs_converter --input_format=tfjs_layers_model attacker.json --output_format=keras out.h5.
- read_weights.read_weights() reads the attacker's chosen file and stores its bytes into data_buffers.
- keras_tfjs_loader writes the leaked bytes into the output .h5 as a weight tensor.
- CI uploads out.h5 as a build artefact or attaches it to a release.
- Attacker downloads the artefact and recovers the secret bytes by reading the corresponding weight HDF5 dataset.
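Until the converter itself is fixed, a pipeline can break this chain at the first step by inspecting the manifest before conversion. The following is a minimal sketch under stated assumptions: find_suspicious_paths is a hypothetical helper (not a tensorflowjs API), it assumes model.json is a JSON object with the usual weightsManifest[].paths[] layout, and it is a lexical check only, so it does not catch symlinks placed inside the model directory.

```python
import json
import ntpath
import posixpath


def find_suspicious_paths(model_json_text: str) -> list:
    """Return manifest paths that are absolute or contain a '..' segment.

    Hypothetical pre-conversion check; not part of tensorflowjs.
    """
    model = json.loads(model_json_text)
    suspicious = []
    for group in model.get('weightsManifest', []):
        for path in group.get('paths', []):
            if not isinstance(path, str):
                suspicious.append(path)
                continue
            # Treat both separator styles as path boundaries.
            segments = path.replace('\\', '/').split('/')
            if (posixpath.isabs(path) or ntpath.isabs(path)
                    or '..' in segments):
                suspicious.append(path)
    return suspicious


if __name__ == '__main__':
    sample = json.dumps({'weightsManifest': [{'paths': [
        'group1-shard1of1.bin',
        '/home/runner/.npmrc',
        '../../../home/runner/.npmrc',
    ]}]})
    flagged = find_suspicious_paths(sample)
    if flagged:
        raise SystemExit('refusing to convert, suspicious paths: %r' % flagged)
```

Failing the job when the list is non-empty keeps the untrusted manifest from ever reaching read_weights().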
Impact
CI runners typically have:
| Secret | What attacker gains |
|---|---|
| $GITHUB_TOKEN | Pull-request write, branch protection bypass, repository takeover |
| $NPM_TOKEN | Publish a malicious update of tensorflowjs or any dependency |
| ~/.ssh/id_rsa | Lateral movement to deployment hosts |
| ~/.docker/config.json | Push poisoned images to organisational registries |
| ~/.npmrc / ~/.pypirc / ~/.cargo/credentials | Supply-chain takeover of any published package |
Captured proof (F5_REAL_IMPACT_PROOF_2026-06-11.txt):
Attempt 1: absolute path => 86 bytes recovered:
b'FLAG{TFJS_ARBITRARY_FILE_READ_PROVEN}root:x:0:0:bash...'
Attempt 2: relative .. path => 86 bytes recovered:
b'FLAG{TFJS_ARBITRARY_FILE_READ_PROVEN}root:x:0:0:bash...'
ARBITRARY FILE READ ON CONVERTER HOST : YES
Both traversal forms succeed against the verbatim vulnerable loop.
Extended Impact β same-root-cause manifestations
- Issue #8628 (upstream, write side, not fixed at HEAD): orthogonal but same fix family (safe_join).
- F-1 / F-2 (Node.js side of the same root-cause class): independent fixes.
A single safe_join(base_path, candidate) helper deployed across
read_weights.py, write_weights.py, keras_tfjs_loader.py, and the
Node.js sister modules closes all related findings.
PoC
git clone https://huggingface.co/martilaio/tfjs-converter-python-readweights-path-traversal-poc
cd tfjs-converter-python-readweights-path-traversal-poc
python3 reproduce.py
The PoC reproduces the four-line vulnerable loop verbatim (no
TensorFlow dependency required) and tests both absolute and .. payloads
against a canary file.
Mitigation
In tfjs-converter/python/tensorflowjs/read_weights.py:
import os


def safe_join(base_path: str, candidate: str) -> str:
    """Join candidate onto base_path, refusing anything that escapes it."""
    if not isinstance(candidate, str):
        raise ValueError('weight path must be str; got %r' % type(candidate))
    if os.path.isabs(candidate):
        raise ValueError('Refusing absolute weight path: %r' % candidate)
    # realpath resolves '..' segments and symlinks before the containment check.
    full = os.path.realpath(os.path.join(base_path, candidate))
    base = os.path.realpath(base_path)
    if os.path.commonpath([full, base]) != base:
        raise ValueError('Weight path escapes model dir: %r' % candidate)
    return full

# Replace L70:
#     with open(os.path.join(base_path, path), 'rb') as f:
# with:
#     with open(safe_join(base_path, path), 'rb') as f:
Apply the same helper in write_weights.py (#8628) and keras_tfjs_loader.py.
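A regression test for the helper might look like the sketch below. It repeats safe_join so that it runs on its own, builds a throwaway model directory with tempfile, and checks the two attack forms from this report plus a symlink placed inside the model directory. The names is_rejected and run_checks, and the shard file name, are illustrative assumptions rather than anything in the tfjs test suite; the symlink case is recorded as None on hosts that cannot create symlinks.

```python
import os
import tempfile


def safe_join(base_path: str, candidate: str) -> str:
    """Copy of the proposed helper, repeated so this sketch is standalone."""
    if not isinstance(candidate, str):
        raise ValueError('weight path must be str; got %r' % type(candidate))
    if os.path.isabs(candidate):
        raise ValueError('Refusing absolute weight path: %r' % candidate)
    full = os.path.realpath(os.path.join(base_path, candidate))
    base = os.path.realpath(base_path)
    if os.path.commonpath([full, base]) != base:
        raise ValueError('Weight path escapes model dir: %r' % candidate)
    return full


def is_rejected(base_path: str, candidate) -> bool:
    """Return True if safe_join refuses the candidate."""
    try:
        safe_join(base_path, candidate)
    except ValueError:
        return True
    return False


def run_checks() -> dict:
    """Exercise safe_join against a temporary model directory."""
    with tempfile.TemporaryDirectory() as root:
        model_dir = os.path.join(root, 'model')
        os.mkdir(model_dir)
        outside = os.path.join(root, 'outside.txt')  # canary outside model_dir
        with open(outside, 'w') as f:
            f.write('canary')
        with open(os.path.join(model_dir, 'group1-shard1of1.bin'), 'wb') as f:
            f.write(b'\x00' * 4)
        results = {
            'in_dir_accepted': not is_rejected(model_dir, 'group1-shard1of1.bin'),
            'absolute_rejected': is_rejected(model_dir, outside),
            'dotdot_rejected': is_rejected(model_dir, '../outside.txt'),
            'non_str_rejected': is_rejected(model_dir, 42),
        }
        link = os.path.join(model_dir, 'link.bin')
        try:
            os.symlink(outside, link)
        except (OSError, NotImplementedError):
            results['symlink_rejected'] = None  # symlinks unavailable here
        else:
            results['symlink_rejected'] = is_rejected(model_dir, 'link.bin')
        return results


if __name__ == '__main__':
    for name, outcome in run_checks().items():
        print(name, outcome)
```

The symlink case is the reason the helper compares realpath results instead of the lexically joined string.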
CVSS
CVSS 3.1 7.5 / High: AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:N/A:N.
Change PR:N to PR:L for gated-CI flows.
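The score can be recomputed from the vector string with the CVSS v3.1 base-score equations. The sketch below is a minimal, unofficial implementation (base metrics only, no temporal or environmental metrics); base_score and roundup are names chosen here. Under these equations the vector as written, with UI:R, evaluates to 6.5, and 7.5 is what the same vector yields with UI:N, so the score and the vector should be reconciled before submission.

```python
import math

# Metric weights from the CVSS v3.1 specification (base metrics only).
AV = {'N': 0.85, 'A': 0.62, 'L': 0.55, 'P': 0.2}
AC = {'L': 0.77, 'H': 0.44}
UI = {'N': 0.85, 'R': 0.62}
CIA = {'H': 0.56, 'L': 0.22, 'N': 0.0}
PR_UNCHANGED = {'N': 0.85, 'L': 0.62, 'H': 0.27}
PR_CHANGED = {'N': 0.85, 'L': 0.68, 'H': 0.5}


def roundup(value: float) -> float:
    """Round up to one decimal place as defined in CVSS v3.1."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the CVSS v3.1 base score for a vector such as 'AV:N/AC:L/...'."""
    m = dict(part.split(':') for part in vector.split('/'))
    changed = m['S'] == 'C'
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[m['PR']]
    iss = 1 - (1 - CIA[m['C']]) * (1 - CIA[m['I']]) * (1 - CIA[m['A']])
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    if impact <= 0:
        return 0.0
    exploitability = 8.22 * AV[m['AV']] * AC[m['AC']] * pr * UI[m['UI']]
    total = impact + exploitability
    if changed:
        total *= 1.08
    return roundup(min(total, 10))


if __name__ == '__main__':
    for vec in ('AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:N/A:N',
                'AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N',
                'AV:N/AC:L/PR:L/UI:R/S:U/C:H/I:N/A:N'):
        print(vec, base_score(vec))
```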
Bug classification
- CWE-22 (Path Traversal)
- CAPEC-126
Affected versions
tensorflowjs Python: all published versions; code at HEAD 7f5309fef.
Files in this repository
| File | Purpose |
|---|---|
| README.md | this disclosure |
| reproduce.py | minimal canary PoC; proves the Python read_weights traversal primitive |
| reproduce_real_impact.py | real-impact PoC; recovers 4 CI-shaped secrets through both attack forms (absolute + ..) |
| F5_REAL_IMPACT_PROOF_2026-06-11.txt | sanitized captured proof + converter-output exfil channel demonstration |