INF-003: Model Deserialization RCE
| | |
| --- | --- |
| Category | Infrastructure |
| Frameworks | ATLAS: ML Supply Chain · OWASP: LLM05 |
Exploit unsafe deserialization in model loading. Many ML frameworks use serialization formats that allow arbitrary code execution when loading untrusted model files.
Technique

```
# Vulnerable formats:
- Pickle (Python's native serialization; used by most ML frameworks)
  Arbitrary code execution on load
- PyTorch .pt/.pth files
  Serialized Python objects (pickle under the hood)
- Joblib files (scikit-learn)
  Same pickle-based risk

# Safe alternatives:
- SafeTensors (weights only, no code)
- ONNX (computation graph, no arbitrary code)
- GGUF (llama.cpp format, weights only)

# Attack: upload poisoned model file to
# MLflow/HuggingFace -> code runs on load
```
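The code-execution primitive behind this technique fits in a few lines. The sketch below uses a deliberately benign payload (it sets a flag rather than running a shell command) and stands in for a malicious object embedded in a .pt/.pth/.joblib file; it is not any specific framework's loader:

```python
import pickle


class PoisonedModel:
    """Stand-in for a malicious object embedded in a serialized model file."""

    def __reduce__(self):
        # Pickle calls this hook to decide how to reconstruct the object,
        # and will invoke the returned callable during deserialization.
        # A real attacker would run a shell command here; this benign
        # payload just sets a module-level flag.
        return (exec, ("import builtins; builtins.PWNED = True",))


blob = pickle.dumps(PoisonedModel())  # what gets uploaded to a model registry
pickle.loads(blob)                    # "loading the model" -> payload runs immediately
```

Note that the payload fires inside `pickle.loads` itself, before the caller ever inspects the returned object.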
Key Concepts
- Python's pickle serialization is the root of the problem. Most ML frameworks (PyTorch, scikit-learn, joblib) build their model formats on pickle, which can execute arbitrary Python code during deserialization via object-reconstruction hooks such as `__reduce__`. Loading an untrusted .pt, .pth, or .joblib file is equivalent to running arbitrary code with the privileges of the loading process.
- The attack surface is the model supply chain. Models are routinely downloaded from public registries (HuggingFace Hub, MLflow Model Registry), shared between teams, and loaded automatically in CI/CD pipelines. A poisoned model file injected at any point in this chain executes code when loaded.
- Safe formats exist but adoption is incomplete. SafeTensors stores only tensor weights without code execution capability. ONNX defines computation graphs without arbitrary code. GGUF is a weights-only format. However, many workflows still default to unsafe serialization formats for convenience or compatibility.
- The execution happens silently during model loading. There is no user interaction, confirmation dialog, or sandbox. The code embedded in the serialized object runs immediately when the model load function is called, often before any inference request is made.
- Container escape compounds the impact. Model loading typically runs in environments with GPU access and network connectivity. Code execution gained through deserialization can escalate to container escape, lateral movement, or data exfiltration from the training infrastructure.
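The structural difference between pickle and a weights-only format is what makes the safe alternatives safe: a SafeTensors-style file is just a JSON header describing tensor names, dtypes, and byte offsets, followed by raw data, so parsing it has no code path to execute. A minimal stdlib-only sketch of that layout (the helper names are hypothetical, and this is a simplification of the real SafeTensors specification):

```python
import json
import struct


def save_weights(tensors: dict) -> bytes:
    """Serialize named float32 vectors as: [8-byte header length][JSON header][raw data]."""
    header, body, offset = {}, b"", 0
    for name, values in tensors.items():
        raw = struct.pack(f"<{len(values)}f", *values)
        header[name] = {"dtype": "F32", "shape": [len(values)],
                        "offsets": [offset, offset + len(raw)]}
        body += raw
        offset += len(raw)
    hdr = json.dumps(header).encode()
    return struct.pack("<Q", len(hdr)) + hdr + body


def load_weights(blob: bytes) -> dict:
    """Pure data parsing: json.loads + struct.unpack, nothing executable in the file."""
    (hdr_len,) = struct.unpack("<Q", blob[:8])
    header = json.loads(blob[8:8 + hdr_len])
    data = blob[8 + hdr_len:]
    return {name: list(struct.unpack(f"<{meta['shape'][0]}f",
                                     data[meta["offsets"][0]:meta["offsets"][1]]))
            for name, meta in header.items()}


weights = {"layer1.weight": [0.5, -1.25, 3.0]}
assert load_weights(save_weights(weights)) == weights
```

Because the loader only ever calls `json.loads` and `struct.unpack`, a poisoned file can at worst produce garbage tensors, not code execution.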
Detection
- Scan model files before loading with tools like fickling or modelscan. These tools analyze serialized model files for embedded code execution payloads without actually deserializing them, catching malicious models before they execute.
- Monitor for unexpected process spawning during model loading. Legitimate model loading should not spawn subprocesses, open network connections, or write to unexpected filesystem paths. Any such behavior during a load call is a strong indicator of deserialization exploitation.
- Verify model file integrity with cryptographic hashes. Maintain a registry of trusted model file hashes and verify every model file against this registry before loading. This prevents substitution attacks where a legitimate model file is replaced with a poisoned one.
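The first detection step can be demonstrated with the stdlib `pickletools` module, which disassembles a pickle stream without deserializing it. This is a simplified sketch of what scanners like fickling and modelscan look for; the opcode denylist here is illustrative, not exhaustive:

```python
import io
import pickle
import pickletools

# Opcodes that import names or invoke callables during unpickling.
SUSPICIOUS = {"GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ", "NEWOBJ", "NEWOBJ_EX"}


def scan_pickle(blob: bytes) -> list:
    """Return suspicious opcode names found in a pickle stream (no code is executed)."""
    return [op.name for op, _, _ in pickletools.genops(io.BytesIO(blob))
            if op.name in SUSPICIOUS]


benign = pickle.dumps({"weights": [0.1, 0.2]})  # plain data: no hits expected


class Evil:
    def __reduce__(self):
        return (print, ("pwned",))


malicious = pickle.dumps(Evil())

print(scan_pickle(benign))     # no suspicious opcodes
print(scan_pickle(malicious))  # contains REDUCE plus a GLOBAL/STACK_GLOBAL import
```

`pickletools.genops` only disassembles the byte stream, so scanning a malicious file is safe; any hit should quarantine the file before a loader ever touches it.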
Mitigation
- Mandate SafeTensors or other safe formats for all model storage and transfer. Prohibit pickle-based formats (.pt, .pth, .joblib) in production pipelines. Use `torch.load(weights_only=True)` when legacy .pt files must be loaded.
- Sandbox model loading in isolated environments. Run model deserialization in containers with no network access, read-only filesystems (except the model output directory), and minimal privileges. If code execution occurs, the blast radius is contained.
- Implement model provenance tracking. Require signed model files with verifiable provenance from training through deployment. Reject models that cannot be traced to a trusted training pipeline with a verified chain of custody.
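The provenance check reduces to "verify a signature before the bytes ever reach a deserializer". A minimal sketch using stdlib HMAC as a stand-in for real signing infrastructure (Sigstore, in-toto, or an internal PKI); the key handling and function names here are hypothetical:

```python
import hashlib
import hmac

# Hypothetical: in practice this key lives in a KMS/HSM, never in source.
SIGNING_KEY = b"replace-with-kms-held-key"


def sign_model(model_bytes: bytes) -> str:
    """Trusted training pipeline emits this signature alongside the model artifact."""
    return hmac.new(SIGNING_KEY, model_bytes, hashlib.sha256).hexdigest()


def load_if_trusted(model_bytes: bytes, signature: str) -> bytes:
    """Gate: refuse to hand bytes to any deserializer unless the signature verifies."""
    expected = sign_model(model_bytes)
    if not hmac.compare_digest(expected, signature):
        raise PermissionError("model failed provenance check; refusing to load")
    return model_bytes  # only now pass to a (preferably weights-only) loader


artifact = b"\x00fake-model-bytes"
sig = sign_model(artifact)
load_if_trusted(artifact, sig)        # passes: untampered artifact
tampered = artifact + b"\x90payload"
# load_if_trusted(tampered, sig)     -> raises PermissionError
```

`hmac.compare_digest` is used instead of `==` to avoid timing side channels; the same gate pattern applies unchanged when HMAC is swapped for asymmetric signatures.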