TensorRT EmbLayerNorm cumulative-sequence write PoC

This repository contains a benign reproduction for unchecked cu_seqlens values in TensorRT's current variable-sequence EmbLayerNorm v4 and v5 plugins.
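For context, the kind of host-side check whose absence is at issue can be sketched as below. This is a minimal illustration, not TensorRT code. It assumes the common convention that cu_seqlens holds batch size + 1 cumulative offsets, starting at 0, non-decreasing, and never exceeding the number of tokens in the packed input. The function name `cu_seqlens_error` and the `total_tokens` parameter are hypothetical.

```python
from typing import Optional, Sequence


def cu_seqlens_error(cu_seqlens: Sequence[int], total_tokens: int) -> Optional[str]:
    """Return a description of the first problem found, or None if the array looks valid.

    Assumed convention (not taken from the plugin source): cu_seqlens has
    batch_size + 1 entries, starts at 0, is non-decreasing, and its last
    entry does not exceed the number of tokens in the packed input buffer.
    """
    if len(cu_seqlens) < 2:
        return "expected at least two entries (batch size + 1)"
    if cu_seqlens[0] != 0:
        return "first entry must be 0"
    # Each offset must be at least as large as the one before it.
    for i in range(1, len(cu_seqlens)):
        if cu_seqlens[i] < cu_seqlens[i - 1]:
            return f"entry {i} decreases ({cu_seqlens[i - 1]} -> {cu_seqlens[i]})"
    # Together with the checks above, this bounds every offset to [0, total_tokens].
    if cu_seqlens[-1] > total_tokens:
        return f"last entry {cu_seqlens[-1]} exceeds total_tokens {total_tokens}"
    return None
```

A caller would reject the request whenever the function returns a non-None message, before any offset is used to index device memory.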

The local proof writes synthetic values into controlled CUDA guard space. The Triton proof changes only the numeric output of a synthetic co-resident victim model. Neither proof executes code or accesses real user data.

See REPORT.md for the technical write-up and triton_repeatability.txt for three fresh-server runs.

Reproduce

Local TensorRT proof:

python make_engine.py
python run_probe.py emb_cuseqlens_hface_fp16.engine

Triton service proof (after adapting start_triton.sh to the local Triton installation if necessary):

bash start_triton.sh
python triton_probe.py load
python triton_probe.py race 20 50
