McCabe Complexity: The Python Metric You Should Care About

Ever looked at a function and thought “this is too complex” but couldn’t quite explain why? Let me introduce you to McCabe complexity - a metric that puts a number on that gut feeling.

What is McCabe Complexity?

McCabe complexity (or cyclomatic complexity) measures the number of linearly independent paths through your code. In simpler terms, it counts the decisions your code can make. Each decision point (if statements, loops, etc.) adds one to the score, so a function's complexity works out to its number of decision points plus one.

Why Should You Care?

Let me share a story from my own experience. I was reviewing a PR for one of my open-source projects. The code looked clean at first glance, but something felt off. Running complexity checks revealed a single function with a complexity of 24! No wonder it was hard to understand - that's 24 linearly independent paths through one function.

How It’s Calculated

Let’s look at some examples, starting simple and working our way up:

Example 1: Simple Function (Complexity = 1)

def greet(name):
    return f"Hello, {name}!"

Example 2: One Decision (Complexity = 2)

def greet(name):
    if name:
        return f"Hello, {name}!"
    return "Hello, stranger!"

Example 3: Multiple Decisions (Complexity = 4)

def greet(name, time_of_day):
    if not name:
        name = "stranger"
    
    if time_of_day < 12:
        greeting = "Good morning"
    elif time_of_day < 17:
        greeting = "Good afternoon"
    else:
        greeting = "Good evening"
        
    return f"{greeting}, {name}!"

Real World Example: The Complexity Trap

Here’s a toy example. Can you spot why it’s complex?

def process_user_data(user_dict):
    if not user_dict:
        return None

    name = user_dict.get('name')
    if not name:
        name = 'Anonymous'

    age = user_dict.get('age')
    if age is not None:
        if not isinstance(age, int) or age <= 0:
            age = None

    if name and age:
        return {'name': name, 'age': age}
    elif name != 'Anonymous':
        return {'name': name}
    else:
        return None

This single function racks up a McCabe score of around 10! Here's how we refactored it:

def get_valid_name(user_dict):
    return user_dict.get('name') or 'Anonymous'

def get_valid_age(user_dict):
    age = user_dict.get('age')
    return age if isinstance(age, int) and age > 0 else None

def process_user_data(user_dict):
    if not user_dict:
        return None
        
    name = get_valid_name(user_dict)
    age = get_valid_age(user_dict)
    
    if name and age:
        return {'name': name, 'age': age}
    elif name != 'Anonymous':
        return {'name': name}
    else:
        return None

Each function now has a complexity of 2-3. Much better! You might object: it does the same thing, just spread over more functions - isn't that more complex? Not really, because now we only need to understand one piece at a time. And when we write tests, we can test each function independently, and those tests will be dramatically simpler to write and debug.
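To make that concrete, here's a sketch of what one of those isolated tests might look like (pytest-style; `get_valid_age` is repeated from above so the snippet stands alone):

```python
def get_valid_age(user_dict):
    age = user_dict.get('age')
    return age if isinstance(age, int) and age > 0 else None

# The helper can be exercised directly - no need to set up a full
# user-processing scenario just to reach the age-validation branches
def test_get_valid_age():
    assert get_valid_age({'age': 30}) == 30
    assert get_valid_age({'age': -5}) is None
    assert get_valid_age({'age': 'thirty'}) is None
    assert get_valid_age({}) is None
```

Compare that with testing the same branches through the original monolithic function, where every case would also need a valid (or invalid) name in the input.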

Checking Complexity with Ruff

Ruff makes it easy to check McCabe complexity. Here’s how:

# pyproject.toml
[tool.ruff.lint]
select = ["C901"]  # Enable McCabe complexity checks

[tool.ruff.lint.mccabe]
max-complexity = 10  # Set maximum allowed complexity

Run it:

ruff check .

Every single time I’ve set this up on a legacy project, I’ve immediately regretted it as I was faced with a few nasty refactors. But after a day or two of work, I’ve never really regretted it. Push through the initial burden and enjoy a lifetime of easier maintenance and debugging.
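If you can't tackle every refactor on day one, ruff can grandfather specific files while the limit still applies everywhere else. A minimal example (the file path here is just a placeholder):

```toml
# pyproject.toml - temporarily exempt a legacy module from C901
[tool.ruff.lint.per-file-ignores]
"legacy/reports.py" = ["C901"]
```

Treat these entries as a to-do list and delete them as you work through the refactors.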

Common Complexity Triggers

Here are the key patterns that often lead to high McCabe complexity. Watch out for these in your code:

1. Nested Conditionals

# Bad (Complexity = 4)
def check_status(user):
    if user.is_active:
        if user.has_permission:
            if user.not_blocked:
                return True
    return False

# Good (Complexity = 1)
def check_status(user):
    return all([
        user.is_active,
        user.has_permission,
        user.not_blocked
    ])

2. Multiple Returns

# Bad (Complexity = 5)
def get_discount(user, item):
    if user.is_premium:
        return 0.2
    if item.is_on_sale:
        return 0.15
    if user.is_first_time:
        return 0.1
    if item.is_clearance:
        return 0.25
    return 0

# Good (Complexity = 3)
def get_discount(user, item):
    # (condition, discount) pairs, checked in the original priority order
    rules = [
        (user.is_premium, 0.2),
        (item.is_on_sale, 0.15),
        (user.is_first_time, 0.1),
        (item.is_clearance, 0.25),
    ]
    for applies, discount in rules:
        if applies:
            return discount
    return 0

This one is a fun example because it exposes what most would call a logic error. Can you spot something weird in our discount policy implementation?

3. Exception Handling

# Bad (Complexity = 6)
def parse_config(file_path):
    try:
        with open(file_path) as f:
            data = f.read()
            if data:
                try:
                    config = json.loads(data)
                    if 'settings' in config:
                        return config['settings']
                except json.JSONDecodeError:
                    return None
    except FileNotFoundError:
        return None
    return None

# Good (Complexity = 3)
def read_file(file_path):
    try:
        with open(file_path) as f:
            return f.read()
    except FileNotFoundError:
        return None

def parse_config(file_path):
    data = read_file(file_path)
    if not data:
        return None
        
    try:
        config = json.loads(data)
        return config.get('settings')
    except json.JSONDecodeError:
        return None

Strategies for Managing Complexity

  1. Break Down Functions

    • Each function should do one thing
    • Extract complex conditions into helper functions
    • Use early returns for guard clauses
  2. Use Data Structures

    • Replace complex if/else with dictionaries
    • Use sets for membership testing
    • Consider enum for state management
  3. Leverage Python Features

    • List comprehensions over loops
    • all()/any() for multiple conditions
    • dict.get() with default values
    • The walrus operator (:=) for assignment in expressions
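To illustrate the second strategy, a dictionary lookup can replace an if/elif chain outright. A minimal sketch (the event names here are invented for the example):

```python
# Hypothetical event router: data instead of branches
def handle_event(event_type):
    handlers = {
        "created": lambda: "send welcome email",
        "deleted": lambda: "archive the record",
    }
    # dict.get() with a default stands in for the final else branch
    action = handlers.get(event_type, lambda: "ignore")
    return action()
```

The if/elif version of this function would gain one point per branch; the dictionary version stays at complexity 1 no matter how many event types you add.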

When Is High Complexity OK?

Sometimes complexity is unavoidable. State machines, parsers, and complex business logic might legitimately have higher complexity. The key is to:

  1. Document why the complexity is necessary
  2. Break it into smaller, well-named functions
  3. Add comprehensive tests
  4. Consider using a design pattern

These special cases demand extra attention and care to ensure reliability.
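For instance, one way to tame a state machine is to push its branching into a transition table, so the data carries the complexity instead of the code. A sketch with invented states and events:

```python
from enum import Enum

class State(Enum):
    IDLE = "idle"
    RUNNING = "running"
    DONE = "done"

# Transition table: (state, event) -> next state.
# Adding a transition means adding a row, not another elif.
TRANSITIONS = {
    (State.IDLE, "start"): State.RUNNING,
    (State.RUNNING, "finish"): State.DONE,
    (State.RUNNING, "abort"): State.IDLE,
}

def step(state, event):
    # Unknown (state, event) pairs leave the state unchanged
    return TRANSITIONS.get((state, event), state)
```

The function's complexity stays at 1 however many states you add; the table itself becomes the thing you document and test.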

The Bottom Line

McCabe complexity isn’t just a number - it’s a guide to more maintainable code. When you keep functions simple and focused, you write code that’s:

  • Easier to understand
  • Easier to test
  • Easier to maintain
  • Less likely to contain bugs

Remember: The goal isn’t to blindly reduce complexity scores. It’s to write code that clearly expresses its intent. Sometimes that means breaking down complex logic, and sometimes it means accepting necessary complexity while managing it well.
