Secure Coding Practices for Python Library Developers
“It’s just a simple utility function,” I thought to myself as I reviewed the code. “What could go wrong?” Then I saw it: the function was blindly executing shell commands based on user input. One carefully crafted argument later, and someone could have turned our helpful library into their personal backdoor.
When you’re building a Python library, you’re not just writing code—you’re creating something that other developers will trust and incorporate into their systems. That trust comes with responsibility. In this post, I’ll share the security practices I’ve learned that help me sleep better at night knowing I’m not accidentally creating vulnerabilities for my users.
Trust No One: The Art of Input Validation
Remember that scene in The Matrix where Neo learns not to trust what he sees? That’s how you should approach any input coming into your library. Whether it’s a function argument, a file’s contents, or a network response—treat it all as potentially malicious. Here’s how I approach this:
```python
import os
import re
from typing import Optional

def process_user_data(input_string: str) -> str:
    # First, validate the type
    if not isinstance(input_string, str):
        raise TypeError("Input must be a string")

    # Then, check length constraints
    if len(input_string) > 1000:  # Reasonable limit for our use case
        raise ValueError("Input string too long")

    # Finally, validate the format (e.g., only alphanumeric)
    if not input_string.isalnum():
        raise ValueError("Input must be alphanumeric")

    # Now we can safely process the input
    return input_string.lower()

def validate_email(email: str) -> bool:
    """Validate email using a simple regex pattern."""
    pattern = r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'
    return bool(re.match(pattern, email))

def sanitize_filename(filename: str) -> Optional[str]:
    """Sanitize a filename to prevent path traversal."""
    # Remove any directory components
    clean_name = os.path.basename(filename)
    # Only allow certain characters
    if not re.match(r'^[a-zA-Z0-9._-]+$', clean_name):
        return None
    return clean_name
```
I’ve learned to prefer allowlisting (specifying what’s allowed) over denylisting (trying to block bad stuff). Why? Because I’m not clever enough to think of all the ways someone might try to break my code, but I can definitely specify what valid input looks like.
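To make that concrete, here's a minimal sketch (the function names are my own, not from any particular library) of the same check written both ways. Notice how the denylist quietly waves through a payload its author didn't think of:

```python
import re

# Denylist: tries to enumerate the bad stuff -- and inevitably misses some
def looks_safe_denylist(value: str) -> bool:
    blocked = [";", "&&", "|", "../"]
    return not any(token in value for token in blocked)

# Allowlist: states exactly what valid input looks like
def looks_safe_allowlist(value: str) -> bool:
    return bool(re.fullmatch(r"[a-z0-9_-]{1,64}", value))

# A shell command substitution: none of the denylisted tokens appear in it,
# so the denylist passes it -- but it fails the allowlist immediately
payload = "$(rm -rf /)"
```

The asymmetry is the whole point: the denylist has to be right about every attack ever invented, while the allowlist only has to be right about your own valid inputs.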
The Principle of Least Privilege: Less is More
Think of permissions like superpowers—cool to have, but with great power comes great responsibility. I try to follow this simple rule: if my code doesn’t absolutely need a permission to function, it shouldn’t have it.
Here’s an example:
```python
import glob
import os

# Before: Reading the entire config directory
def load_config():
    return glob.glob("/etc/myapp/**/*")

# After: Reading only what we need
def load_config():
    config_file = "/etc/myapp/config.yaml"
    if not os.path.exists(config_file):
        raise FileNotFoundError(f"Config file not found: {config_file}")
    return [config_file]
```
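The same thinking applies to what your library creates, not just what it reads. Here's a sketch of my own (assuming a POSIX system) for writing a file that only the owning user can touch:

```python
import os

def write_private_file(path: str, data: str) -> None:
    # O_EXCL makes creation fail if the file already exists (which defeats
    # symlink tricks), and mode 0o600 grants read/write to the owner only
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as f:
        f.write(data)
```

A plain `open(path, "w")` would have created the file with whatever the process umask allows, which often means group- or world-readable.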
Graceful Error Handling: The Art of Saying Nothing
One of my favorite security principles is that good error messages are like good waiters—helpful when needed but not revealing what’s happening in the kitchen. Here’s how I structure error handling:
```python
import logging

logger = logging.getLogger(__name__)

try:
    result = process_sensitive_data(user_input)
except ValueError as e:
    # Log the detailed error for debugging
    logger.debug(f"Validation error: {e}", exc_info=True)
    # But only show a generic message to the user
    raise ValueError("Invalid input provided") from None
except Exception:
    # Log any unexpected errors
    logger.error("Unexpected error", exc_info=True)
    # Generic message for users
    raise RuntimeError("An unexpected error occurred") from None
```
But take a close look at this: where do these logs go, and who can read them? A detailed error dumped into a world-readable log file is its own information leak.
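One answer is to make sure secrets never reach the logs in the first place. Here's a sketch of one way to do that with a `logging.Filter` (the regex and logger name are my own illustration, and a real pattern would need to cover your actual secret formats):

```python
import logging
import re

class RedactSecrets(logging.Filter):
    """Mask anything that looks like an API key before the record is emitted."""
    PATTERN = re.compile(r"(api[_-]?key\s*[=:]\s*)\S+", re.IGNORECASE)

    def filter(self, record: logging.LogRecord) -> bool:
        record.msg = self.PATTERN.sub(r"\1[REDACTED]", str(record.msg))
        return True  # keep the record, just scrubbed

logger = logging.getLogger("mylib")
logger.addFilter(RedactSecrets())
```

Because the filter sits on the logger, every handler downstream sees only the scrubbed message.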
Dependencies: Your Code’s Extended Family
Your library is only as secure as its weakest dependency. I learned this the hard way when a tiny utility package we used turned out to have a critical vulnerability. Now I follow these rules:
- Keep a lean dependency list—every package you add is code you’re trusting
- Run `pip-audit` regularly to check for known vulnerabilities
- Set up Dependabot to automatically update dependencies
Here’s what my typical GitHub Actions workflow looks like:
```yaml
name: Security Checks
on: [push, pull_request]

jobs:
  security:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
      - name: Install dependencies
        run: pip install pip-audit bandit
      - name: Check dependencies
        run: pip-audit
      - name: Run security linter
        run: bandit -r .
```
The Standard Library: Friend or Foe?
Even Python’s standard library can be a source of security issues if used carelessly. Here are some patterns I’ve learned to avoid:
```python
# DON'T: Unsafe deserialization
import pickle
data = pickle.loads(user_input)  # Could execute arbitrary code!

# DO: Use safer alternatives
import json
data = json.loads(user_input)  # Much safer!

# DON'T: Shell injection vulnerability
import subprocess
subprocess.run(f"git clone {user_input}", shell=True)  # Dangerous!

# DO: Use safer parameter passing
subprocess.run(["git", "clone", user_input])  # Much better!
```
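If you genuinely can't avoid `shell=True` (say, you need a pipeline), the standard library's `shlex.quote` turns an untrusted value into a single, inert shell word:

```python
import shlex

untrusted = "repo.git; rm -rf /"
# quote() wraps anything containing shell metacharacters in single quotes,
# so the shell sees one harmless argument instead of a second command
command = f"git clone {shlex.quote(untrusted)}"
# command == "git clone 'repo.git; rm -rf /'"
```

The argument-list form above is still the better default; quoting is the fallback for the rare case where a shell is truly required.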
Testing for Security: Beyond “Does It Work?”
Security testing is different from regular testing. Instead of asking “Does this work?”, we need to ask “How could this break?” Here are some toy tests that validate protection against specific attacks:

```python
import pytest

def test_path_traversal_attempt():
    with pytest.raises(ValueError):
        process_filename("../../../etc/passwd")  # Should reject path traversal

def test_command_injection_attempt():
    with pytest.raises(ValueError):
        process_command("echo 'hello'; rm -rf /")  # Should reject command injection
```
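Beyond single cases, I like throwing a small corpus of known attack payloads at validators. Here's a sketch against the `sanitize_filename` helper from earlier (repeated here so the example is self-contained):

```python
import os
import re

def sanitize_filename(filename):
    """The allowlist-based sanitizer from earlier in the post."""
    clean_name = os.path.basename(filename)
    if not re.match(r'^[a-zA-Z0-9._-]+$', clean_name):
        return None
    return clean_name

ATTACK_PAYLOADS = [
    "../../../etc/passwd",        # classic traversal
    "..\\..\\windows\\system32",  # Windows-style separators
    "file\x00.txt",               # embedded NUL byte
    "name; rm -rf /",             # shell metacharacters
]

def test_sanitizer_neutralizes_payloads():
    for payload in ATTACK_PAYLOADS:
        result = sanitize_filename(payload)
        # Either rejected outright, or reduced to a harmless basename
        assert result is None or re.fullmatch(r'[a-zA-Z0-9._-]+', result)
```

A corpus like this also doubles as regression protection: when a new bypass is reported, the payload goes into the list and stays there forever.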
Secure Configuration Management: The Safe Way to Handle Secrets
One area that bit me recently was configuration management. We had a library that needed API keys for optional features, and initially, we just had users pass them directly as function arguments. Bad idea! Here’s how we fixed it:
```python
import json
import os
from functools import wraps
from typing import Optional

class SecureConfig:
    def __init__(self, config_path: Optional[str] = None):
        self._config = {}
        if config_path:
            self._load_from_file(config_path)
        self._load_from_env()

    def _load_from_file(self, path: str) -> None:
        """Load config from file, with proper permissions checking."""
        if not os.path.exists(path):
            raise FileNotFoundError(f"Config file not found: {path}")

        # Check file permissions (Unix-like systems)
        if os.name == 'posix':
            stat = os.stat(path)
            if stat.st_mode & 0o077:  # Check if group/others have any access
                raise PermissionError("Config file has unsafe permissions")

        with open(path) as f:
            self._config.update(json.load(f))

    def _load_from_env(self) -> None:
        """Load config from environment variables."""
        for key in os.environ:
            if key.startswith('MYLIB_'):
                self._config[key[6:].lower()] = os.environ[key]

    def get(self, key: str, default: Optional[str] = None) -> Optional[str]:
        """Get a config value safely."""
        return self._config.get(key, default)

def requires_api_key(f):
    """Decorator to check for required API key."""
    @wraps(f)
    def wrapper(self, *args, **kwargs):
        if not self.config.get('api_key'):
            raise ValueError("API key required for this operation")
        return f(self, *args, **kwargs)
    return wrapper

class MyLibrary:
    def __init__(self, config: Optional[SecureConfig] = None):
        self.config = config or SecureConfig()

    @requires_api_key
    def do_api_operation(self, data: str) -> str:
        # Use self.config.get('api_key') safely here
        return "Result"
```
This approach gives us several security benefits:
- API keys can be loaded from environment variables (12-factor app style)
- Config files are checked for proper permissions
- Sensitive operations are clearly marked with decorators
- Users have flexibility in how they provide credentials
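The environment-variable piece of this is small enough to show in isolation. Here's a sketch of the same prefix pattern that `_load_from_env` uses above (the `MYLIB_` prefix is illustrative, not a real package's convention):

```python
import os

def load_prefixed_env(prefix: str = "MYLIB_") -> dict:
    """Collect settings from environment variables carrying a library prefix."""
    return {
        key[len(prefix):].lower(): value
        for key, value in os.environ.items()
        if key.startswith(prefix)
    }
```

Because the keys live in the process environment rather than in source, they never end up in version control, and they can't leak through tracebacks that print function arguments.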
A Case Study: The Path Not Taken
Let me share an example incident that these practices helped prevent. We have a function that generated temporary files based on user input:
```python
# The original code (DON'T DO THIS!)
def create_temp_file(filename: str) -> str:
    path = f"/tmp/{filename}"
    with open(path, 'w') as f:
        f.write("some data")
    return path

# What could go wrong? Well...
create_temp_file("../../../etc/passwd")  # Oops! Path traversal
create_temp_file("file; rm -rf /")  # Double oops! A booby trap if this path ever reaches a shell
```
Thanks to our input validation practices, we caught this during code review. Here’s how we fixed it:
```python
import os
import tempfile
from typing import Tuple

def create_temp_file(filename: str) -> Tuple[str, str]:
    """Create a temporary file safely.

    Returns:
        Tuple[str, str]: (directory path, filename)
    """
    # Sanitize the filename
    clean_name = sanitize_filename(filename)
    if not clean_name:
        raise ValueError("Invalid filename")

    # Create a secure temporary directory
    temp_dir = tempfile.mkdtemp()

    # Create the file inside our controlled directory
    path = os.path.join(temp_dir, clean_name)
    with open(path, 'w') as f:
        f.write("some data")

    return temp_dir, clean_name
```
This version:
- Sanitizes the filename
- Uses `tempfile` for secure temporary directory creation
- Properly joins paths to prevent directory traversal
- Returns both the directory and filename for proper cleanup
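When the temporary file is only needed for the duration of one operation, a context manager removes even the cleanup burden. This is a variant I'd reach for (my own sketch, not the library's actual code):

```python
import os
import tempfile

def demo_scoped_temp_file() -> tuple:
    with tempfile.TemporaryDirectory() as temp_dir:
        path = os.path.join(temp_dir, "note.txt")
        with open(path, "w") as f:
            f.write("some data")
        content = open(path).read()
    # Directory and file are gone the moment the block exits,
    # even if an exception had been raised inside it
    return content, os.path.exists(temp_dir)
```

The trade-off is lifetime: `mkdtemp` suits the return-a-path API above, while `TemporaryDirectory` suits work that finishes inside one call.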
A team using our library later told us they had been targeted by an attacker who tried path traversal attacks. Thanks to these practices, the attempts failed.
The Bottom Line
Security isn’t a feature you can add later—it’s a mindset you need from the start. Every time I write a new function, I try to think: “How could this be misused?” It might seem paranoid, but in the world of security, a little paranoia is a good thing.
Remember: your code might run in contexts you never imagined, processing input you never expected. By following these practices, you’re not just protecting your code—you’re protecting every system that will ever use it.
And if you’re ever tempted to skip security practices because “it’s just a small library,” remember my story from the beginning. In security, there’s no such thing as “just” anything.