CNTK v2 model load out-of-bounds read (SIGSEGV) PoC

This repository is a security proof-of-concept for an out-of-bounds read in Microsoft Cognitive Toolkit (CNTK) v2, triggered by loading a crafted model file with cntk.Function.load() or cntk.load_model().

It is gated on purpose. The crafted files crash a native deserializer; do not load them outside a throwaway environment.

What is here

  • evil_gatherpacked.cntkmodel: crafted model, op GatherPacked, zero inputs
  • evil_packedindex.cntkmodel: crafted model, op PackedIndex, zero inputs
  • evil_scatterpacked.cntkmodel: crafted model, op ScatterPacked, zero inputs
  • good.cntkmodel: benign control model (loads fine)
  • verify.py: offline differential verifier (child process per load, prints exit codes)
  • generate.py: regenerates all crafted files from scratch
  • CNTK.proto / CNTK_pb2.py: the on-disk format schema used to craft the files

Root cause

The CNTK v2 on-disk format is a protobuf-serialized Dictionary tree. Loading runs CNTK::Function::Load -> CompositeFunction::Deserialize, which rebuilds each PrimitiveFunction and then calls RawOutputs() -> InitOutputs() -> PrimitiveFunction::InferOutputs() -> PrimitiveFunction::GetOutputDynamicAxes() while still inside the load call.

GetOutputDynamicAxes (Source/CNTKv2LibraryDll/PrimitiveFunction.cpp) indexes the operand vector by fixed position based on the op, with no bounds check:

else if (op == PrimitiveOpType::ScatterPacked)
    outputDynamicAxes = inputs[2].DynamicAxes();
else if ((op == PrimitiveOpType::PackedIndex) || (op == PrimitiveOpType::GatherPacked))
    outputDynamicAxes = inputs[1].DynamicAxes();

inputs is the deserialized operand list. Its length is taken verbatim from the model file (GetInputVariables) and is never checked against the arity the op requires. A crafted function with one of these ops and an empty inputs vector makes inputs[1] / inputs[2] read past the end of a std::vector<Variable>. The out-of-bounds Variable holds a garbage m_dataFields pointer, and Variable::DynamicAxes() dereferences it, faulting.
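
The missing check can be modelled in a few lines. This is an illustrative Python sketch, not CNTK code: the table and the check_arity name are hypothetical, and the indices are taken only from the branch quoted above, so they give a lower bound on the operand count rather than each op's full arity.

```python
# Illustrative model only; none of these names exist in CNTK.
# Highest operand index read for each op in the quoted branch of
# GetOutputDynamicAxes.
HIGHEST_INDEX_READ = {
    "ScatterPacked": 2,  # reads inputs[2]
    "PackedIndex": 1,    # reads inputs[1]
    "GatherPacked": 1,   # reads inputs[1]
}


def check_arity(op_name, num_inputs):
    """Raise ValueError if the op would index past the end of its inputs."""
    highest = HIGHEST_INDEX_READ.get(op_name)
    if highest is not None and num_inputs <= highest:
        raise ValueError(
            "%s reads inputs[%d] but only %d input(s) were deserialized"
            % (op_name, highest, num_inputs)
        )
```

A guard of this shape, applied to the deserialized operand list before it is indexed, would turn the crafted files into a load error instead of a fault.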

Observed result (Linux, CNTK 2.7 CPU, Python 3.6)

good.cntkmodel                   rc=0    ok        benign control  LOADED_OK
evil_packedindex.cntkmodel       rc=-11  SIGSEGV   op=PackedIndex(28),   0 inputs -> inputs[1] OOB
evil_gatherpacked.cntkmodel      rc=-11  SIGSEGV   op=GatherPacked(29),  0 inputs -> inputs[1] OOB
evil_scatterpacked.cntkmodel     rc=-11  SIGSEGV   op=ScatterPacked(30), 0 inputs -> inputs[2] OOB
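
The rc column follows the convention of Python's subprocess module on POSIX: a negative return code -N means the child was terminated by signal N. A minimal sketch of that decoding follows; describe_returncode is a hypothetical helper name and is not claimed to be what verify.py does.

```python
import signal


def describe_returncode(rc):
    """Turn a subprocess return code into a short human-readable label."""
    if rc == 0:
        return "ok"
    if rc < 0:
        # Negative values mean "killed by signal -rc" on POSIX.
        try:
            return signal.Signals(-rc).name
        except ValueError:
            return "signal %d" % -rc
    return "exit %d" % rc
```

On Linux, signal 11 is SIGSEGV, which is why rc=-11 and SIGSEGV appear side by side in the rows above.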

Backtrace at the fault (frames #3 to #5 are not shown):

#0 CNTK::Variable::DynamicAxes() const
#1 CNTK::PrimitiveFunction::GetOutputDynamicAxes(...)
#2 CNTK::PrimitiveFunction::InferOutputs(...)
#6 CNTK::Function::InitOutputs()
#7 CNTK::CompositeFunction::Deserialize(...)
#8 CNTK::Function::Deserialize(...)
#9 CNTK::Function::Load(...)
#10 _wrap_Function_load   (cntk.Function.load)

Only the crafted mutation crashes; the otherwise-identical benign model loads fine. That differential rules out a generic large-allocation failure.

Reproduce

The newest Python that CNTK 2.7 ships CPU wheels for is 3.6 (manylinux1). In a Python 3.6 environment with cntk==2.7 installed and the OpenMPI 1.10 runtime (libmpi.so.12) on the library path, run:

python verify.py
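
The child-process-per-load pattern matters because a native fault kills the whole interpreter, so each load has to run in its own process for the parent to survive and record the result. The sketch below shows one way to do that under stated assumptions: LOAD_SNIPPET, run_child and check_models are hypothetical names, and this is not the contents of verify.py.

```python
import subprocess
import sys

# Hypothetical child snippet: loads the model path passed as argv[1].
LOAD_SNIPPET = "import sys, cntk; cntk.Function.load(sys.argv[1])"


def run_child(snippet, *args):
    """Run `snippet` in a fresh interpreter and return its exit status."""
    proc = subprocess.run(
        [sys.executable, "-c", snippet] + list(args),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return proc.returncode


def check_models(paths, snippet=LOAD_SNIPPET):
    """Load each path in its own child process; map path -> return code."""
    return {path: run_child(snippet, path) for path in paths}
```

Calling check_models(["good.cntkmodel", "evil_packedindex.cntkmodel"]) inside the throwaway environment would give one return code per file, which is the differential the results table reports.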

To rebuild the crafted files from a fresh benign model (the first command needs the grpcio-tools package and regenerates CNTK_pb2.py from CNTK.proto):

python -m grpc_tools.protoc -I. --python_out=. CNTK.proto
python generate.py