Python Logging Best Practices for Library Developers

As someone who’s developed a few Python libraries (category-encoders, elote, etc.), I’ve learned that good logging can make the difference between a library that’s a joy to use and one that makes you want to pull your hair out. Let’s dive into the logging best practices that experience has taught me.

The Golden Rules of Library Logging

Before we get into the nitty-gritty details, let’s establish some fundamental principles:

  1. Don’t Be a Console Hog: Your library should be seen and not heard (unless explicitly asked)
  2. Respect the Parent Application: Never configure the root logger (see the anti-pattern sketch after this list)
  3. Make it Configurable: Give users control over logging behavior
  4. Be Consistent: Follow a clear pattern in your log messages
  5. Performance Matters: Logging shouldn’t impact your library’s speed
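
To make rule 2 concrete, here’s a minimal sketch of the anti-pattern next to the library-friendly alternative:

import logging

# Anti-pattern: never call this inside a library. basicConfig() attaches a
# handler to the root logger (or silently does nothing if the application
# configured logging first); either way, that decision belongs to the app.
logging.basicConfig(level=logging.DEBUG)

# Library-friendly alternative: log through your own named logger and let
# the application decide where, and whether, the output goes.
logger = logging.getLogger(__name__)
logger.debug("Stays quiet unless the application has opted in")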

Setting Up Your Logger

The first step is setting up your logger correctly. Here’s the pattern I use in all my libraries:

import logging

# Create a logger specific to your library
logger = logging.getLogger(__name__)

# Don't add any handlers by default
logger.addHandler(logging.NullHandler())

This setup follows the principle of least surprise. The NullHandler swallows any records that reach it, so your library emits nothing by default; without it, Python’s last-resort handler would print WARNING-and-above messages to stderr even in applications that never configured logging.
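
The flip side is how the consuming application opts in. Nothing library-specific here; this is the standard pattern, assuming your package’s loggers all live under the 'your_library' namespace:

import logging

# Application side: send INFO and above to stderr, including your_library
logging.basicConfig(level=logging.INFO)

# Or target just the library's subtree with its own handler and level
lib_logger = logging.getLogger("your_library")
lib_logger.setLevel(logging.DEBUG)
lib_logger.addHandler(logging.StreamHandler())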

Logging Levels: When to Use What

Here’s my opinionated guide on when to use each logging level:

  • DEBUG: Internal state that’s only useful when troubleshooting

    logger.debug("Processing row %d with parameters: %s", row_idx, params)
  • INFO: Notable but expected events

    logger.info("Successfully loaded %d categories from vocabulary", len(categories))
  • WARNING: Something’s not quite right, but we can handle it

    logger.warning("Missing values detected in column '%s', filling with mode", col_name)
  • ERROR: Something’s wrong and it’s affecting functionality (see the traceback tip after this list)

    logger.error("Failed to encode column '%s': %s", col_name, str(e))
  • CRITICAL: The library can’t function

    logger.critical("Unable to load critical resource '%s', cannot continue", resource)
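
A companion to the ERROR example: inside an except block, logger.exception logs at ERROR level and appends the current traceback automatically, which is usually what you want (encode_column here is a hypothetical library call):

try:
    encoded = encode_column(df, col_name)  # hypothetical call
except ValueError:
    # Logs at ERROR level with the traceback attached
    logger.exception("Failed to encode column '%s'", col_name)
    raise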

Advanced Patterns

Contextual Logging

When debugging complex issues, context is king. Consider using contextvars or thread-local storage:

from contextvars import ContextVar

request_id = ContextVar('request_id', default=None)

def process_data(data):
    logger.debug("Processing data batch", extra={
        'request_id': request_id.get(),
        'batch_size': len(data)
    })
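
The extra dict attaches request_id to each record, but nothing displays it unless a formatter references it. A logging.Filter can stamp the current context onto every record automatically, so call sites don’t need to remember the extra dict. A minimal sketch, building on the request_id ContextVar above (the format string is the application’s choice, not the library’s):

import logging

class RequestContextFilter(logging.Filter):
    """Stamp every record with the current request_id (or '-' if unset)."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.request_id = request_id.get() or "-"
        return True  # never drop the record, only annotate it

# Application side: attach the filter and reference the field in the format
handler = logging.StreamHandler()
handler.addFilter(RequestContextFilter())
handler.setFormatter(logging.Formatter(
    "%(asctime)s %(name)s [%(request_id)s] %(message)s"
))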

Performance Optimization

Always check log levels before constructing expensive messages. Lazy %-formatting only defers building the final string; the arguments are still evaluated at call time, so guard expensive work explicitly:

if logger.isEnabledFor(logging.DEBUG):
    logger.debug("Complex calculation result: %s", 
                 expensive_calculation())
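
A related trick keeps call sites tidy without the explicit guard: wrap the expensive call in an object whose __str__ does the work, so the cost is paid only if the record is actually formatted. A minimal sketch:

class LazyStr:
    """Defer an expensive computation until the record is formatted."""

    def __init__(self, func):
        self.func = func

    def __str__(self):
        return str(self.func())

# expensive_calculation runs only if a handler actually formats the record
logger.debug("Complex calculation result: %s", LazyStr(expensive_calculation))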

Structured Logging

For machine-readable logs, consider using structured logging:

logger.info("Model training complete", extra={
    'accuracy': 0.95,
    'n_samples': 1000,
    'model_type': 'RandomForest'
})
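
For those extra fields to reach a log pipeline, something has to serialize them; packages like python-json-logger handle this well. Here’s a minimal standard-library sketch of the idea (the field whitelist is an assumption for illustration):

import json
import logging

class JsonFormatter(logging.Formatter):
    """Render each record, plus selected extra fields, as one JSON line."""

    EXTRA_FIELDS = ("accuracy", "n_samples", "model_type")

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        for field in self.EXTRA_FIELDS:
            if hasattr(record, field):
                payload[field] = getattr(record, field)
        return json.dumps(payload)

# Usage (application side): handler.setFormatter(JsonFormatter())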

Common Pitfalls to Avoid

  1. Never Assume Log Destination: Your logs might go to files, cloud services, or nowhere
  2. Avoid Print Statements: Always use the logging framework
  3. Don’t Log Sensitive Data: Be careful with user data and credentials (a redaction sketch follows this list)
  4. Be Smart About String Formatting:
    • For simple messages, f-strings are fine: logger.info(f"Processing file {filename}"), though remember they’re evaluated even when the level is disabled
    • Lazy %-formatting defers building the string until a handler needs it: logger.debug("Result: %s", large_object); for genuinely expensive arguments, add the isEnabledFor guard shown earlier
    • extra never interpolates into the message; it attaches attributes to the record for formatters and filters: logger.info("User logged in", extra={'user': username}). To interpolate a mapping into the message itself, pass it as the lone argument: logger.info("User %(user)s logged in", {'user': username})
  5. Don’t Silence Errors: Log them properly instead
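
To make pitfall 3 concrete, a logging.Filter can scrub known-sensitive fields before records leave your library. This is a minimal sketch, assuming the sensitive values arrive as record attributes via extra:

import logging

SENSITIVE_KEYS = {"password", "api_key", "token"}  # extend as needed

class RedactingFilter(logging.Filter):
    """Replace known-sensitive record attributes with a placeholder."""

    def filter(self, record: logging.LogRecord) -> bool:
        for key in SENSITIVE_KEYS:
            if hasattr(record, key):
                setattr(record, key, "[REDACTED]")
        return True

logger.addFilter(RedactingFilter())
logger.info("Authenticating user", extra={"api_key": "secret-value"})
# Any formatter that renders %(api_key)s now sees "[REDACTED]"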

Testing Your Logging

Don’t forget to test your logging! Here are patterns I use with both unittest and pytest:

Using unittest

import unittest

class TestLogging(unittest.TestCase):
    def test_warning_on_missing_values(self):
        with self.assertLogs('your_library', level='WARNING') as cm:
            # Trigger the code that should log
            process_data(data_with_missing_values)

        # Assert after the block, so assertLogs can report a clean failure
        # if nothing was logged at all
        self.assertIn('Missing values detected', cm.output[0])
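
The inverse is worth testing too: that your library stays quiet on the happy path. Python 3.10 added assertNoLogs for exactly this; adding a method to the TestLogging class above (clean_data is a hypothetical fixture):

    def test_silent_on_clean_data(self):
        # Fails if anything at WARNING or above is logged (Python 3.10+)
        with self.assertNoLogs('your_library', level='WARNING'):
            process_data(clean_data)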

Using pytest

import logging
import pytest
from typing import Generator

def test_warning_on_missing_values(caplog: pytest.LogCaptureFixture) -> None:
    # Set the level we want to capture
    caplog.set_level('WARNING', logger='your_library')
    
    # Trigger the code that should log
    process_data(data_with_missing_values)
    
    # Check the log message
    assert 'Missing values detected' in caplog.text
    
    # For more detailed assertions:
    assert len(caplog.records) == 1
    record = caplog.records[0]
    assert record.levelname == 'WARNING'
    assert record.name == 'your_library'

@pytest.fixture
def mock_logger() -> Generator[None, None, None]:
    """Fixture to temporarily change log level and restore it after."""
    logger = logging.getLogger('your_library')
    original_level = logger.level
    logger.setLevel(logging.DEBUG)
    yield
    logger.setLevel(original_level)

def test_debug_logging_with_context(mock_logger: None, caplog: pytest.LogCaptureFixture) -> None:
    """Test debug logging with request context."""
    # Set the value on the library's own ContextVar so process_data sees it.
    # (Assumes your_library exposes request_id; a ContextVar created fresh
    # inside the test would be invisible to the code under test.)
    from your_library import request_id
    request_id.set('test-123')
    
    # Set debug level capture
    caplog.set_level('DEBUG', logger='your_library')
    
    # Trigger logging
    process_data({'test': 'data'})
    
    # Verify log contents and structure
    assert 'Processing data batch' in caplog.text
    assert any(
        getattr(record, 'request_id', None) == 'test-123'
        for record in caplog.records
    )

The pytest approach offers some nice advantages:

  • Plain assert statements instead of assertEqual-style methods
  • Powerful fixtures for setup/teardown
  • Direct access to captured records and their attributes via caplog
  • A typed caplog fixture (pytest.LogCaptureFixture)
  • Cleaner syntax for complex test scenarios

Both approaches work well, so choose the one that matches your project’s testing style. I tend to prefer pytest these days for its expressiveness and rich ecosystem of plugins.

Conclusion

Good logging is an art that balances informativeness with unobtrusiveness. By following these practices, you’ll create a library that’s both powerful and pleasant to use. Remember: the best logs are the ones that are there when you need them and invisible when you don’t.
