Python Logging Best Practices for Library Developers
As someone who’s developed a few Python libraries (category-encoders, elote, etc.), I’ve learned that good logging can make the difference between a library that’s a joy to use and one that makes you want to pull your hair out. Let’s dive into what I’ve learned about logging best practices for Python library developers.
The Golden Rules of Library Logging
Before we get into the nitty-gritty details, let’s establish some fundamental principles:
- Don’t Be a Console Hog: Your library should be seen and not heard (unless explicitly asked)
- Respect the Parent Application: Never configure the root logger (see the sketch after this list)
- Make it Configurable: Give users control over logging behavior
- Be Consistent: Follow a clear pattern in your log messages
- Performance Matters: Logging shouldn’t impact your library’s speed
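To make the second rule concrete, here is a sketch of the anti-pattern to avoid: configuring the root logger from inside library code takes those decisions away from the application that imports you.

# Anti-pattern: never do this at import time inside a library
import logging

logging.basicConfig(level=logging.DEBUG)                 # reconfigures the root logger for the whole process
logging.getLogger().addHandler(logging.StreamHandler())  # bolts a console handler onto everyone's logs

Both calls mutate global state that belongs to the host application; the setup in the next section shows the polite alternative.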
Setting Up Your Logger
The first step is setting up your logger correctly. Here’s the pattern I use in all my libraries:
import logging
# Create a logger specific to your library
logger = logging.getLogger(__name__)
# Don't add any handlers by default
logger.addHandler(logging.NullHandler())
This setup follows the principle of least surprise: because the logger has a NullHandler attached, your library won’t emit anything on its own (not even the fall-through “last resort” output to stderr) unless the user configures logging themselves.
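On the application side, enabling a library’s logs is then a normal logging configuration step. A minimal sketch, assuming the library’s loggers all live under the your_library namespace:

import logging

# Option 1: configure logging for the whole application
logging.basicConfig(level=logging.INFO)

# Option 2: opt in to just this library's logger hierarchy
lib_logger = logging.getLogger("your_library")
lib_logger.setLevel(logging.DEBUG)
lib_logger.addHandler(logging.StreamHandler())

Because the library only ever attached a NullHandler, either approach works without fighting any built-in configuration.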
Logging Levels: When to Use What
Here’s my opinionated guide on when to use each logging level:
DEBUG: Internal state that’s only useful when troubleshooting
logger.debug("Processing row %d with parameters: %s", row_idx, params)
INFO: Notable but expected events
logger.info("Successfully loaded %d categories from vocabulary", len(categories))
WARNING: Something’s not quite right, but we can handle it
logger.warning("Missing values detected in column '%s', filling with mode", col_name)
ERROR: Something’s wrong and it’s affecting functionality
logger.error("Failed to encode column '%s': %s", col_name, str(e))
CRITICAL: The library can’t function
logger.critical("Unable to load critical resource '%s', cannot continue", resource)
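A related habit for the ERROR level: inside an except block, logger.exception() logs at ERROR and appends the full traceback automatically, which is usually more useful than interpolating str(e). A small sketch (encode_column is just a stand-in for whatever call can fail):

try:
    encode_column(df, col_name)
except ValueError:
    # ERROR-level record with the traceback attached
    logger.exception("Failed to encode column '%s'", col_name)
    raise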
Advanced Patterns
Contextual Logging
When debugging complex issues, context is king. Consider using contextvars or thread-local storage:
from contextvars import ContextVar

request_id = ContextVar('request_id', default=None)

def process_data(data):
    logger.debug("Processing data batch", extra={
        'request_id': request_id.get(),
        'batch_size': len(data)
    })
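If you want that request ID on every line without threading extra through each call, a logging.Filter can stamp it onto records automatically. A minimal sketch, reusing the request_id ContextVar from above (attaching the filter is the application's job, not the library's):

import logging

class RequestContextFilter(logging.Filter):
    """Copy the current request_id onto every record passing through."""
    def filter(self, record: logging.LogRecord) -> bool:
        record.request_id = request_id.get()
        return True  # annotate, never drop

handler = logging.StreamHandler()
handler.addFilter(RequestContextFilter())
handler.setFormatter(logging.Formatter("%(levelname)s [%(request_id)s] %(message)s"))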
Performance Optimization
Always check log levels before constructing expensive messages:
if logger.isEnabledFor(logging.DEBUG):
    logger.debug("Complex calculation result: %s",
                 expensive_calculation())
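If you would rather keep the call site to one line, another common trick is to push the expensive work into __str__, so it only runs when the record is actually formatted. A small sketch (LazyRepr is a hypothetical helper, not part of the logging module):

class LazyRepr:
    """Defers a computation until the log record is formatted."""
    def __init__(self, func):
        self.func = func

    def __str__(self):
        return str(self.func())

# expensive_calculation is only called if a handler actually formats this record
logger.debug("Complex calculation result: %s", LazyRepr(expensive_calculation))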
Structured Logging
For machine-readable logs, consider using structured logging:
logger.info("Model training complete", extra={
'accuracy': 0.95,
'n_samples': 1000,
'model_type': 'RandomForest'
})
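Those extra fields end up as attributes on the LogRecord, but a plain formatter won't display them. If you need genuinely machine-readable output, one option is a tiny JSON formatter like the sketch below (in practice, libraries such as python-json-logger or structlog handle this more robustly):

import json
import logging

class JsonFormatter(logging.Formatter):
    """Render each record as one JSON line, including selected extra fields."""
    EXTRA_FIELDS = ("accuracy", "n_samples", "model_type")  # the keys used in the example above

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        for field in self.EXTRA_FIELDS:
            if hasattr(record, field):
                payload[field] = getattr(record, field)
        return json.dumps(payload)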
Common Pitfalls to Avoid
- Never Assume Log Destination: Your logs might go to files, cloud services, or nowhere
- Avoid Print Statements: Always use the logging framework
- Don’t Log Sensitive Data: Be careful with user data and credentials
- Be Smart About String Formatting:
  - For simple messages, f-strings are fine:
    logger.info(f"Processing file {filename}")
  - For expensive operations, use lazy %-style formatting (and the isEnabledFor check above when even building the arguments is costly):
    logger.debug("Complex result: %s", expensive_calculation())
  - When attaching extra context, avoid f-strings and keep the data out of the message string (extra sets attributes on the record; it is not substituted into the message):
    logger.info("User %s logged in", username, extra={'user': username})
- Don’t Silence Errors: Log them properly instead (see the sketch below)
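To make the last point concrete, here is the difference in a short sketch (transform and df are stand-ins): a swallowed exception leaves no trail at all, while logging and re-raising preserves both the behavior and the evidence.

# Bad: the failure vanishes silently
try:
    transform(df)
except Exception:
    pass

# Better: record what happened, then let the caller decide
try:
    transform(df)
except Exception:
    logger.exception("Transformation failed")
    raise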
Testing Your Logging
Don’t forget to test your logging! Here are patterns I use with both unittest and pytest:
Using unittest
import unittest

class TestLogging(unittest.TestCase):
    def test_warning_on_missing_values(self):
        with self.assertLogs('your_library', level='WARNING') as cm:
            # Trigger the code that should log
            process_data(data_with_missing_values)
        # Check the log message
        self.assertIn('Missing values detected', cm.output[0])
Using pytest
import pytest
from typing import Generator
from _pytest.logging import LogCaptureFixture

def test_warning_on_missing_values(caplog: LogCaptureFixture) -> None:
    # Set the level we want to capture
    caplog.set_level('WARNING', logger='your_library')
    # Trigger the code that should log
    process_data(data_with_missing_values)
    # Check the log message
    assert 'Missing values detected' in caplog.text
    # For more detailed assertions:
    assert len(caplog.records) == 1
    record = caplog.records[0]
    assert record.levelname == 'WARNING'
    assert record.name == 'your_library'

@pytest.fixture
def mock_logger() -> Generator[None, None, None]:
    """Fixture to temporarily change log level and restore it after."""
    import logging

    logger = logging.getLogger('your_library')
    original_level = logger.level
    logger.setLevel(logging.DEBUG)
    yield
    logger.setLevel(original_level)

def test_debug_logging_with_context(mock_logger: None, caplog: LogCaptureFixture) -> None:
    """Test debug logging with request context."""
    # Use the library's own ContextVar (the one from the contextual-logging example),
    # otherwise process_data never sees the value we set; 'your_library' is the
    # placeholder module name used throughout this post.
    from your_library import request_id

    request_id.set('test-123')
    # Set debug level capture
    caplog.set_level('DEBUG', logger='your_library')
    # Trigger logging
    process_data({'test': 'data'})
    # Verify log contents and structure
    assert 'Processing data batch' in caplog.text
    assert any(
        getattr(record, 'request_id', None) == 'test-123'
        for record in caplog.records
    )
The pytest approach offers some nice advantages:
- More pythonic assertions
- Powerful fixtures for setup/teardown
- Better handling of log context and metadata
- Built-in type hints support
- Cleaner syntax for complex test scenarios
Both approaches work well, so choose the one that matches your project’s testing style. I tend to prefer pytest these days for its expressiveness and rich ecosystem of plugins.
Conclusion
Good logging is an art that balances informativeness with unobtrusiveness. By following these practices, you’ll create a library that’s both powerful and pleasant to use. Remember: the best logs are the ones that are there when you need them and invisible when you don’t.