Parquet Trigger Backdoor PoC

This repository contains a benign security research proof of concept for a Parquet model-file vulnerability report.

Files:

  • control_parquet_classifier.parquet
  • malicious_parquet_trigger.parquet
  • reproduce.py

The malicious Parquet file is loaded with pyarrow.parquet.read_table(). For normal inputs it produces the same predictions as the control file, but tensor values hidden in the file change the model output at inference time when a trigger input is supplied.
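
To illustrate the idea, here is a minimal sketch of how a single altered weight can stay dormant on normal inputs and flip a prediction on a trigger input. The 3-feature linear classifier, the weight values, and the `predict` helper are all hypothetical and are not taken from the PoC files; the model stored in the Parquet files may be structured differently.

```python
import numpy as np


def predict(weights: np.ndarray, bias: np.ndarray, x: np.ndarray) -> np.ndarray:
    """Linear classifier: pick the class with the largest logit."""
    return np.argmax(x @ weights + bias, axis=1)


# Hypothetical clean model: 3 input features, 2 classes.
# The third feature carries zero weight, so the clean model ignores it.
control_w = np.array([[1.0, -1.0],
                      [-1.0, 1.0],
                      [0.0, 0.0]])
bias = np.zeros(2)

# Hypothetical backdoored copy: identical except for one large weight
# attached to the otherwise unused third feature.
malicious_w = control_w.copy()
malicious_w[2, 0] = 10.0

# Normal inputs leave the third feature at zero.
normal_inputs = np.array([[0.0, 1.0, 0.0],
                          [1.0, 0.0, 0.0]])

# Second row sets the third feature, which acts as the trigger here.
trigger_inputs = np.array([[0.0, 1.0, 0.0],
                           [0.0, 1.0, 1.0]])

print("normal  / control  :", predict(control_w, bias, normal_inputs).tolist())
print("normal  / malicious:", predict(malicious_w, bias, normal_inputs).tolist())
print("trigger / control  :", predict(control_w, bias, trigger_inputs).tolist())
print("trigger / malicious:", predict(malicious_w, bias, trigger_inputs).tolist())
```

In this toy setup both weight sets agree whenever the third feature is zero, so accuracy checks on ordinary data would not reveal the change; only the row with the trigger feature set is classified differently.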

Tested runtime:

  • pyarrow==24.0.0
  • numpy==2.4.6
  • trigger entrypoint: pyarrow.parquet.read_table(path), followed by numpy inference

Scanner result:

  • modelscan==0.8.8
  • result: No issues found!
  • skipped reason: Model Scan did not scan file (the Parquet file was skipped, so the clean result does not reflect an inspection of its contents)

Public files:

  • Control: https://huggingface.co/hacnho/parquet-trigger-backdoor-poc/resolve/main/control_parquet_classifier.parquet
  • Malicious: https://huggingface.co/hacnho/parquet-trigger-backdoor-poc/resolve/main/malicious_parquet_trigger.parquet
  • Reproducer: https://huggingface.co/hacnho/parquet-trigger-backdoor-poc/resolve/main/reproduce.py

Reproduction:

python -m venv /tmp/parquet-trigger-poc-venv
. /tmp/parquet-trigger-poc-venv/bin/activate
pip install 'numpy==2.4.6' 'pyarrow==24.0.0' 'modelscan==0.8.8'

curl -L -o control_parquet_classifier.parquet \
  https://huggingface.co/hacnho/parquet-trigger-backdoor-poc/resolve/main/control_parquet_classifier.parquet
curl -L -o malicious_parquet_trigger.parquet \
  https://huggingface.co/hacnho/parquet-trigger-backdoor-poc/resolve/main/malicious_parquet_trigger.parquet
curl -L -o reproduce.py \
  https://huggingface.co/hacnho/parquet-trigger-backdoor-poc/resolve/main/reproduce.py

python reproduce.py control_parquet_classifier.parquet malicious_parquet_trigger.parquet

Expected output:

"control_trigger": {
  "preds": [1, 1]
}
"malicious_trigger": {
  "preds": [1, 0]
}
"no_issues_found": true