F-5: Arbitrary file read in Python tensorflowjs.read_weights via attacker-controlled manifest paths[] → CI secret exfiltration through converter output .h5

Authorized security research artifact disclosed via huntr.com's TensorFlow.js Model Format Vulnerability program. Source commit 7f5309fef0a47545e34049903dbdae0f97285f7e. All capture data was collected against a synthetic /tmp/victim_host/ CI-runner lab; no real PII is present.

Real impact captured (sanitized)

4 / 4 CI secrets recovered via Python read_weights(), then exfiltrated into the converter's output .h5

  • .npmrc (91 B), .docker/config.json (264 B), ~/.ssh/id_rsa (269 B), .env (274 B)
  • Both attack forms work: (a) absolute-path injection (os.path.join discards base), (b) .. segments (no canonicalisation)
  • Recovered bytes land inside the converter's output .h5 weight dataset → leaked when CI publishes the artifact

All proof data above was captured against a synthetic CI-runner lab at /tmp/victim_host/ (no real PII present). Full capture: F5_REAL_IMPACT_PROOF_2026-06-11.txt.


Summary

A CI/CD pipeline running tensorflowjs_converter --input_format=tfjs_layers_model on an attacker-supplied PR or commit (a common pattern for any project that auto-converts tfjs model artefacts as part of CI) will leak its runner secrets into the converter's output .h5, including $GITHUB_TOKEN, $NPM_TOKEN, ~/.ssh/id_rsa, ~/.docker/config.json, and ~/.npmrc. The root cause is in tfjs-converter/python/tensorflowjs/read_weights.py L65-L74, which calls open(os.path.join(base_path, attacker_path), 'rb') with no containment check. os.path.join silently discards base_path when attacker_path is absolute, and does not normalise .. segments. Both traversal forms succeed; the recovered bytes flow into data_buffers, become weight tensor data in read_weights()'s return value, and end up as tensor weights in the converter's .h5 output, which CI pipelines typically upload as build artefacts or push to registries.

Root Cause

Lines of Code:

In read_weights.py:65-74:

data_buffers = []
for group in weights_manifest:
    buff = io.BytesIO()
    buff_writer = io.BufferedWriter(buff)
    for path in group['paths']:                                # ← attacker JSON
        with open(os.path.join(base_path, path), 'rb') as f:   # ← :70 no containment
            buff_writer.write(f.read())
    buff_writer.flush()
    buff_writer.seek(0)
    data_buffers.append(buff.read())
return decode_weights(weights_manifest, data_buffers, flatten=flatten)

os.path.join has two documented behaviours that make it unsafe with untrusted input:

  1. If path is absolute, os.path.join('/safe', '/etc/passwd') returns '/etc/passwd': base_path is silently discarded.
  2. If path contains .., nothing normalises or rejects the traversal: os.path.join('/safe', '../../etc/passwd') returns '/safe/../../etc/passwd', which open happily follows out of base_path.
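The two behaviours can be checked in isolation. The snippet below is a minimal sketch that uses posixpath (the POSIX flavour of os.path) so the results match a Linux CI runner regardless of where it is run; the /safe base directory is the same illustrative value used in the list above.

```python
import posixpath  # POSIX flavour of os.path, so results match a Linux runner

base_path = "/safe"

# Behaviour 1: an absolute second argument discards everything before it.
absolute = posixpath.join(base_path, "/etc/passwd")

# Behaviour 2: ".." segments are kept verbatim by join(); nothing rejects them,
# and the operating system resolves them when open() is called.
relative = posixpath.join(base_path, "../../etc/passwd")

# normpath() shows where the relative form ends up once ".." is resolved.
resolved = posixpath.normpath(relative)

print(absolute)   # path returned for the absolute payload
print(relative)   # path returned for the ".." payload
print(resolved)   # lexical resolution of the ".." payload
```

In both cases the final path lies outside /safe, which is why a containment check has to run after joining rather than rely on the join itself.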

The bytes from the leaked file flow into data_buffers, are passed to decode_weights, and are reshaped into a tensor whose underlying memory is the secret content. The caller (keras_tfjs_loader) then writes this tensor into the output .h5 model as a "weight".
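To illustrate why file bytes survive being treated as weights, here is a minimal standard-library sketch. It does not call decode_weights; it assumes, for illustration only, a manifest entry that declares the data as float32, and it uses Python's array module as a stand-in for the tensor buffer. The secret value is a made-up placeholder.

```python
from array import array

# Hypothetical stand-in for the contents of a leaked file.
secret = b"//registry.example/:_authToken=placeholder\n"

# float32 elements are 4 bytes wide, so pad to a multiple of 4
# (an attacker can instead pick a shape that matches the file size).
padded = secret + b"\x00" * (-len(secret) % 4)

# Reinterpret the raw bytes as float32 values, as a weight buffer would be.
weights = array("f")
weights.frombytes(padded)

# Serialising the "weights" again returns the original bytes unchanged.
recovered = weights.tobytes()[: len(secret)]
print(recovered == secret)
```

The numeric values are meaningless as weights, but the underlying bytes are preserved, so whoever can read the stored tensor can read the file.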

Why this is NOT a duplicate of upstream issue #8628: #8628 (tensorflow/tfjs, Feb 2026, still open) reports a path-traversal on the write side in write_weights.py. F-5 is on the read side in read_weights.py. Same root-cause class (os.path.join with attacker string and no canonicalisation), but a different file, different function, different impact (write-side enables overwriting attacker-chosen paths; read-side enables exfiltrating arbitrary bytes). A fix to #8628 will not silently fix F-5.

Internal Pre-conditions

  1. The victim pipeline runs tensorflowjs_converter --input_format=tfjs_layers_model (or tensorflowjs.converters.keras_tfjs_loader.deserialize_tfjs_layers_model programmatically) on the attacker's model.json + manifest.
  2. The pipeline outputs an artefact (.h5, .tar, release attachment) that the attacker can subsequently read.

External Pre-conditions

None.

Attack Path

  1. Attacker opens a PR (or pushes to a tracked branch) that adds a model.json whose weightsManifest[0].paths[0] is an absolute path (e.g. /home/runner/.docker/config.json) or a .. traversal (e.g. ../../../home/runner/.npmrc).
  2. CI runs tensorflowjs_converter --input_format=tfjs_layers_model attacker.json --output_format=keras out.h5.
  3. read_weights.read_weights() reads the attacker's chosen file and stores its bytes into data_buffers.
  4. keras_tfjs_loader writes the leaked bytes into the output .h5 as a weight tensor.
  5. CI uploads out.h5 as a build artefact or attaches it to a release.
  6. Attacker downloads the artefact and recovers the secret bytes by reading the corresponding weight HDF5 dataset.
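A pipeline can look for manifests of the shape described in step 1 before it runs the converter. The following is a defensive sketch, not part of tensorflowjs: the function name suspicious_weight_paths is hypothetical, the check is purely lexical (it does not detect symlinks inside the model directory), and it assumes POSIX-style separators.

```python
import json
import posixpath


def suspicious_weight_paths(model_json_text):
    """Return manifest paths that are absolute or climb out of the model dir."""
    model = json.loads(model_json_text)
    flagged = []
    for group in model.get("weightsManifest", []):
        for path in group.get("paths", []):
            if not isinstance(path, str):
                flagged.append(path)  # non-string entries are never valid
                continue
            normalised = posixpath.normpath(path)
            escapes = normalised == ".." or normalised.startswith("../")
            if posixpath.isabs(path) or escapes:
                flagged.append(path)
    return flagged


# Illustrative manifest: one ordinary shard name and the two payload forms.
example = json.dumps({
    "weightsManifest": [{
        "paths": [
            "group1-shard1of1.bin",
            "/home/runner/.docker/config.json",
            "../../../home/runner/.npmrc",
        ],
        "weights": [],
    }]
})
print(suspicious_weight_paths(example))
```

A check like this is only a stopgap for CI owners; the containment fix belongs in read_weights itself.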

Impact

CI runners typically have:

| Secret | What attacker gains |
| --- | --- |
| $GITHUB_TOKEN | Pull-request write, branch protection bypass, repository takeover |
| $NPM_TOKEN | Publish a malicious update of tensorflowjs or any dependency |
| ~/.ssh/id_rsa | Lateral movement to deployment hosts |
| ~/.docker/config.json | Push poisoned images to organisational registries |
| ~/.npmrc / ~/.pypirc / ~/.cargo/credentials | Supply-chain takeover of any published package |

Captured proof (F5_REAL_IMPACT_PROOF_2026-06-11.txt):

Attempt 1 - absolute path  => 86 bytes recovered:
   b'FLAG{TFJS_ARBITRARY_FILE_READ_PROVEN}root:x:0:0:bash...'
Attempt 2 - relative .. path => 86 bytes recovered:
   b'FLAG{TFJS_ARBITRARY_FILE_READ_PROVEN}root:x:0:0:bash...'
ARBITRARY FILE READ ON CONVERTER HOST : YES  ✓✓✓

Both traversal forms succeed against the verbatim vulnerable loop.

Extended Impact β€” same-root-cause manifestations

  • Issue #8628 (upstream, write side, not fixed at HEAD): orthogonal but same fix family (safe_join).
  • F-1 / F-2 (Node.js side of the same root-cause class): independent fixes.

A single safe_join(base_path, candidate) helper deployed across read_weights.py, write_weights.py, keras_tfjs_loader.py, and the Node.js sister modules closes all related findings.

PoC

git clone https://huggingface.co/martilaio/tfjs-converter-python-readweights-path-traversal-poc
cd tfjs-converter-python-readweights-path-traversal-poc
python3 reproduce.py

The PoC reproduces the four-line vulnerable loop verbatim (no TensorFlow dependency required) and tests both absolute and .. payloads against a canary file.
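For readers who want to see the primitive without cloning the repository, the following is a minimal sketch of such a reproduction. It is not the repository's reproduce.py: the function name read_group_bytes, the canary contents, and the directory layout are illustrative assumptions. The loop body is the one quoted under Root Cause, and everything runs inside a temporary directory.

```python
import io
import os
import tempfile


def read_group_bytes(paths, base_path):
    """The vulnerable inner loop from Root Cause, lifted into a function."""
    buff = io.BytesIO()
    buff_writer = io.BufferedWriter(buff)
    for path in paths:
        with open(os.path.join(base_path, path), 'rb') as f:  # no containment
            buff_writer.write(f.read())
    buff_writer.flush()
    buff_writer.seek(0)
    return buff.read()


CANARY = b"CANARY-OUTSIDE-MODEL-DIR"

with tempfile.TemporaryDirectory() as root:
    model_dir = os.path.join(root, "model")   # plays the role of base_path
    os.mkdir(model_dir)
    canary_path = os.path.join(root, "canary.txt")  # sits outside model_dir
    with open(canary_path, "wb") as f:
        f.write(CANARY)

    # Form (a): absolute path, so os.path.join discards model_dir.
    via_absolute = read_group_bytes([canary_path], model_dir)
    # Form (b): ".." segment, which climbs from model_dir to its parent.
    via_dotdot = read_group_bytes(["../canary.txt"], model_dir)

print(via_absolute == CANARY, via_dotdot == CANARY)
```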

Mitigation

In tfjs-converter/python/tensorflowjs/read_weights.py:

import os

def safe_join(base_path: str, candidate: str) -> str:
    if not isinstance(candidate, str):
        raise ValueError('weight path must be str; got %r' % type(candidate))
    if os.path.isabs(candidate):
        raise ValueError('Refusing absolute weight path: %r' % candidate)
    full = os.path.realpath(os.path.join(base_path, candidate))
    base = os.path.realpath(base_path)
    if os.path.commonpath([full, base]) != base:
        raise ValueError('Weight path escapes model dir: %r' % candidate)
    return full

# Replace L70:
#   with open(os.path.join(base_path, path), 'rb') as f:
# With:
    with open(safe_join(base_path, path), 'rb') as f:

Apply the same helper in write_weights.py (#8628) and keras_tfjs_loader.py.
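The helper can be exercised against both payload forms before it is wired into the converter. The sketch below repeats safe_join as given above so that it is self-contained; the shard name and the payload strings are illustrative. Because the helper calls os.path.realpath on both sides, a symlink inside the model directory that points elsewhere is rejected as well.

```python
import os
import tempfile


def safe_join(base_path, candidate):
    """Join candidate onto base_path, refusing anything that escapes it."""
    if not isinstance(candidate, str):
        raise ValueError('weight path must be str; got %r' % type(candidate))
    if os.path.isabs(candidate):
        raise ValueError('Refusing absolute weight path: %r' % candidate)
    full = os.path.realpath(os.path.join(base_path, candidate))
    base = os.path.realpath(base_path)
    if os.path.commonpath([full, base]) != base:
        raise ValueError('Weight path escapes model dir: %r' % candidate)
    return full


with tempfile.TemporaryDirectory() as model_dir:
    # An ordinary shard name resolves to a path inside the model directory.
    accepted = safe_join(model_dir, "group1-shard1of1.bin")

    # Both attack forms, plus a traversal hidden behind a subdirectory.
    rejected = []
    for payload in ("/etc/passwd", "../../etc/passwd", "sub/../../escape.bin"):
        try:
            safe_join(model_dir, payload)
        except ValueError:
            rejected.append(payload)

print(os.path.basename(accepted))
print(rejected)
```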

CVSS

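CVSS 3.1 vector: AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:N/A:N. Under the CVSS 3.1 base formula this vector scores 6.5 (Medium); the 7.5 (High) rating corresponds to the same vector with UI:N, which fits pipelines that convert untrusted PRs with no maintainer action. Use PR:L instead of PR:N in gated-CI flows.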
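The base score for a vector of this shape can be recomputed from the metric weights published in the CVSS 3.1 specification. The sketch below handles only Scope: Unchanged vectors, and the function names are hypothetical; it evaluates the vector with UI:R and with UI:N so the two results can be compared.

```python
import math


def roundup(value):
    """CVSS 3.1 Roundup: smallest one-decimal number >= value."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(av, ac, pr, ui, c, i, a):
    """CVSS 3.1 base score for Scope: Unchanged vectors only."""
    iss = 1 - (1 - c) * (1 - i) * (1 - a)
    impact = 6.42 * iss
    exploitability = 8.22 * av * ac * pr * ui
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


# Metric weights from the CVSS 3.1 specification.
AV_N, AC_L, PR_N = 0.85, 0.77, 0.85
UI_N, UI_R = 0.85, 0.62
HIGH, NONE = 0.56, 0.0

# AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:N/A:N
score_ui_required = base_score(AV_N, AC_L, PR_N, UI_R, HIGH, NONE, NONE)
# AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N
score_ui_none = base_score(AV_N, AC_L, PR_N, UI_N, HIGH, NONE, NONE)

print(score_ui_required, score_ui_none)
```

Under these weights the UI:R form evaluates to 6.5 and the UI:N form to 7.5, so the choice of User Interaction metric decides whether the finding is rated Medium or High.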

Bug classification

  • CWE-22 (Path Traversal)
  • CAPEC-126

Affected versions

tensorflowjs Python: all published versions; code at HEAD 7f5309fef.

Files in this repository

| File | Purpose |
| --- | --- |
| README.md | this disclosure |
| reproduce.py | minimal canary PoC: proves the Python read_weights traversal primitive |
| reproduce_real_impact.py | real-impact PoC: recovers 4 CI-shaped secrets through both attack forms (absolute + ..) |
| F5_REAL_IMPACT_PROOF_2026-06-11.txt | sanitized captured proof + converter-output exfil channel demonstration |