F-17: Infinite loop in tensorflowjs_converter StatelessWhile monopolises microtask queue; in-process watchdog never fires
Authorized security research artifact disclosed via huntr.com's
TensorFlow.js Model Format Vulnerability program.
Source commit 7f5309fef0a47545e34049903dbdae0f97285f7e. All capture data was
collected against a synthetic /tmp/victim_host/ CI-runner lab; no real PII is present.
Real impact captured (sanitized)
In-process watchdog never fires; the process must be SIGKILLed externally
- Parent spawned child with setTimeout(watchdog, 1500ms) inside the executor: never fired
- 8 s wall-clock burn: only the parent's external SIGKILL terminated the child
- Confirms the bug bypasses every existing in-process timeout / cancellation primitive
All proof data above was captured against a synthetic CI-runner lab at /tmp/victim_host/ (no real PII present). Full capture: F17_REAL_IMPACT_PROOF_2026-06-11.txt.
Summary
A Node.js service that calls model.executeAsync(...) on an attacker-supplied
GraphModel containing a StatelessWhile (or While) op with a permanently
truthy condition will hang forever, and any setTimeout watchdog or
HTTP request timeout the caller relies on cannot fire because the loop
runs as a continuous chain of microtasks that never returns control to the
macrotask queue. Only an external SIGKILL (from a supervisor, container
OOM, or K8s liveness probe) recovers the process.
The PoC confirms this end-to-end: child hung for 8 s, parent watchdog issued
SIGKILL, exit code null, signal SIGKILL.
Root Cause
Lines of Code:
In control_executor.ts:49-103:
case 'While':
case 'StatelessWhile': {
  const bodyFunc = getParamValue('body', node, tensorMap, context) as string;
  const condFunc = getParamValue('cond', node, tensorMap, context) as string;
  const args = getParamValue('args', node, tensorMap, context) as Tensor[];
  const condResult = await context.functionMap[condFunc].executeFunctionAsync(...);
  let condValue = await condResult[0].data();
  let result: Tensor[] = args;
  while (condValue[0]) {  // no iteration cap
    result = await context.functionMap[bodyFunc].executeFunctionAsync(...);
    const condResult = await context.functionMap[condFunc].executeFunctionAsync(...);
    condValue = await condResult[0].data();
  }
  return result;
}
TF's protobuf WhileLoop op carries a maximum_iterations attribute
precisely to bound this loop. tfjs's operation mapper parses the attr
into node.attr['maximum_iterations'], but the executor never reads or
enforces it.
Why a setTimeout watchdog cannot save you
Each iteration is a chain of microtasks
(await context.functionMap[bodyFunc].executeFunctionAsync(...) →
await context.functionMap[condFunc].executeFunctionAsync(...) →
await condResult[0].data()). Node drains the microtask queue completely
before it runs the next macrotask, so a loop that keeps refilling the queue
runs forever without ever returning control to the macrotask queue.
In-process watchdogs (setTimeout, HTTP server request timeouts, Express
req.setTimeout, k6 client-side time limits, custom abort controllers
driven by setInterval) cannot fire because they are macrotasks.
Only an external SIGKILL recovers the process.
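The starvation effect can be reproduced without tfjs. The sketch below is a bounded stand-in, not the PoC: spinOnMicrotasks and demo are illustrative names, and Promise.resolve() takes the place of the executor's awaited calls, on the assumption that those resolve without I/O or timers. A 0 ms timer plays the watchdog. Because the loop is capped, the example terminates and the order of events can be observed.

```typescript
// Awaits only already-resolved promises, so every resumption is a microtask.
async function spinOnMicrotasks(iterations: number): Promise<number> {
  let i = 0;
  while (i < iterations) {
    await Promise.resolve();
    i++;
  }
  return i;
}

// Records whether the "watchdog" timer or the end of the loop comes first.
async function demo(iterations: number): Promise<string[]> {
  const events: string[] = [];
  // The timer is due immediately, but its callback is a macrotask.
  const timerFired = new Promise<void>((resolve) => {
    setTimeout(() => {
      events.push('watchdog fired');
      resolve();
    }, 0);
  });
  await spinOnMicrotasks(iterations);
  events.push('loop finished');
  await timerFired;
  return events;
}

demo(100_000).then((events) => console.log(events.join(' -> ')));
// prints "loop finished -> watchdog fired"
```

The timer callback runs only after the loop has ended. With an unbounded loop there is no such point, which is the behaviour described above.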
Why this is NOT a duplicate of F-23 (mutual function recursion): F-23
uses StatelessIf calling itself or a sibling function unboundedly through
the GraphDef's library.function table, exhausting frames via recursion.
F-17 uses StatelessWhile with no recursion and spins indefinitely in a
flat loop. The two differ in op (While vs If), attack shape
(loop vs recursion), and fix (read the maximum_iterations attr vs add
a recursion-depth counter). Both live in the same executor file but
require separate patches.
Internal Pre-conditions
- Victim Node.js process calls tf.loadGraphModel(<url>) followed by model.executeAsync(...) on the attacker model.
- Process uses @tensorflow/tfjs-converter ≤ 4.22.0.
External Pre-conditions
None.
Attack Path
- Attacker authors a model.json GraphDef with two function definitions in library.function:
  - cond_fn(x, t) = Greater(x, t) (with x=0.0, t=-1.0: permanently true),
  - body_fn(x, t) = (Identity(x), Identity(t)) (no progress).
- Top-level node: StatelessWhile(args=[0.0, -1.0], cond=cond_fn, body=body_fn).
- Attacker delivers model.json + 24-byte weight shard to the victim.
- Victim loads the model and calls model.executeAsync({}, ['loop']).
- The executor enters while (condValue[0]) and never exits. The Promise returned to the caller never resolves.
- Victim's setTimeout(req.abort, 30_000) watchdog never fires; the service worker is permanently consumed; container metrics show 100% CPU forever.
- Only an external SIGKILL (supervisor, K8s liveness probe, OOM-killer) recovers the worker.
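Until a patched converter is available, a service that must accept third-party models could screen the parsed model.json for the ops this attack path relies on before calling executeAsync. The following is a minimal sketch, not a complete defence. findLoopNodes is a hypothetical helper, and the field names it reads (node, library.function, nodeDef, taken from the modelTopology object) are assumptions about the JSON layout that should be checked against the model files actually in use.

```typescript
// Minimal structural types for the parts of the graph this sketch inspects.
// The field names are assumptions about the model.json layout.
interface NodeDefLike {
  name?: string;
  op?: string;
}
interface GraphDefLike {
  node?: NodeDefLike[];
  library?: { function?: Array<{ nodeDef?: NodeDefLike[] }> };
}

const LOOP_OPS = ['While', 'StatelessWhile'];

// Returns the names of loop nodes at top level and inside library functions.
function findLoopNodes(graph: GraphDefLike): string[] {
  const candidates: NodeDefLike[] = [];
  for (const n of graph.node ?? []) candidates.push(n);
  for (const fn of graph.library?.function ?? []) {
    for (const n of fn.nodeDef ?? []) candidates.push(n);
  }
  const found: string[] = [];
  for (const n of candidates) {
    if (LOOP_OPS.indexOf(n.op ?? '') !== -1) found.push(n.name ?? '<unnamed>');
  }
  return found;
}

// Shape mirrors the attack path: one top-level StatelessWhile named "loop".
const suspicious: GraphDefLike = {
  node: [{ name: 'loop', op: 'StatelessWhile' }],
  library: { function: [{ nodeDef: [{ name: 'gt', op: 'Greater' }] }] },
};
console.log(findLoopNodes(suspicious)); // logs an array containing only "loop"
```

Rejecting every model that contains a loop op is a blunt policy, since legitimate models use While too. A caller might instead route such models to a worker that an external supervisor can kill.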
Impact
Captured PoC F17_REAL_IMPACT_PROOF_2026-06-11.txt:
PoC F-17 v2 - StatelessWhile(cond=true forever)
init = 0.0 (carry state)
thresh = -1.0
condition: Greater(x, -1.0) - always true
body: (Identity(x), Identity(t)) - no progress
[watchdog 8s SIGKILL]
[exit c=null s=SIGKILL]
exit c=null s=SIGKILL is conclusive: executeAsync never returned; the
in-process setTimeout watchdog never fired; the only path to recover the
process was an external supervisor kill.
Service-level impact:
- Each worker that handles one malicious model.executeAsync request becomes permanently consumed.
- A sustained attack leads to full service DoS as every worker is taken offline.
- Even K8s/Docker SIGKILL recovery is slow (deployments must roll new pods), so it is easy to keep the service offline indefinitely.
Mitigation
In control_executor.ts:49-103, read maximum_iterations from the node
attr (already parsed by the operation mapper) and break when exceeded:
case 'While':
case 'StatelessWhile': {
  // bodyFunc, condFunc, args, condValue and result are set up as before.
  const maxIters = (getParamValue(
      'maximum_iterations', node, tensorMap, context) as number) ?? Infinity;
  const HARD_DEFAULT_CEILING = 1_000_000;
  // Effective bound: the model's own limit, capped by the hard ceiling.
  const limit = Math.min(maxIters, HARD_DEFAULT_CEILING);
  let i = 0;
  while (condValue[0]) {
    if (i++ >= limit) {
      throw new Error(
          `StatelessWhile exceeded maximum_iterations=${maxIters} (ceiling ${HARD_DEFAULT_CEILING})`);
    }
    result = await context.functionMap[bodyFunc].executeFunctionAsync(...);
    ...
  }
}
Fall back to a hard ceiling (e.g. 10⁶) when the attr is missing.
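The capping logic can be exercised in isolation. The sketch below is a self-contained illustration, not the tfjs patch: boundedWhile is a hypothetical helper, and condFn and bodyFn are plain-number stand-ins for the PoC's Greater(x, t) condition and identity body.

```typescript
// Hypothetical helper (not a tfjs API): an async while loop that rejects
// once the body has run maxIterations times and the condition is still true.
async function boundedWhile<S>(
  init: S,
  cond: (state: S) => Promise<boolean>,
  body: (state: S) => Promise<S>,
  maxIterations: number,
): Promise<S> {
  let state = init;
  let iterations = 0;
  while (await cond(state)) {
    if (iterations >= maxIterations) {
      throw new Error(`loop exceeded ${maxIterations} iterations`);
    }
    state = await body(state);
    iterations++;
  }
  return state;
}

// Stand-ins for the PoC functions: x=0 is always greater than t=-1,
// and the body returns its input unchanged.
type LoopState = { x: number; t: number };
const condFn = async (s: LoopState): Promise<boolean> => s.x > s.t;
const bodyFn = async (s: LoopState): Promise<LoopState> => s;

boundedWhile({ x: 0, t: -1 }, condFn, bodyFn, 1000)
  .then(() => console.log('loop terminated normally'))
  .catch((err: Error) => console.log(`rejected: ${err.message}`));
// prints "rejected: loop exceeded 1000 iterations"
```

Rejecting, as the patch above does, surfaces the malicious model to the caller as an ordinary error. A cap only bounds the work: the capped iterations still run as microtasks, so the ceiling should be small enough that reaching it is acceptable.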
CVSS
CVSS 3.1: 7.5 / High (AV:N/AC:L/PR:N/UI:R/S:U/C:N/I:N/A:H).
Bug classification
- CWE-835 (Loop with Unreachable Exit Condition, "Infinite Loop")
- CWE-770 (Allocation of Resources Without Limits or Throttling)
Affected versions
@tensorflow/tfjs-converter ≤ 4.22.0.
Files in this repository
| File | Purpose |
|---|---|
| README.md | this disclosure |
| package.json | npm dependencies for one-step npm install |
| reproduce.js | minimal PoC: StatelessWhile with always-true predicate monopolises the microtask queue |
| reproduce_real_impact.js | watchdog-bypass demo: spawns child with in-process setTimeout(1500ms) watchdog; only external SIGKILL terminates |
| F17_REAL_IMPACT_PROOF_2026-06-11.txt | captured 8 s wall-clock burn with in-process watchdog never firing |