Mutation Testing with mumut for Pygeohash

Exploring Ideas: A Blog on Technology, Startups, Food, and More

Ensuring the reliability and correctness of a library like pygeohash is paramount. While traditional code coverage metrics tell us which lines of code our tests execute, they don’t tell us how well those lines are tested. Did our tests actually check the logic, or did they just run through it? This is where mutation testing comes in.

What is Mutation Testing?

Mutation testing is a powerful technique for evaluating the quality of a test suite. The core idea is simple yet effective:

Introduce Small Bugs: The mutation testing tool automatically modifies your source code in small, specific ways, creating “mutants.” These mutations are designed to mimic common programming errors (e.g., changing < to <=, + to -, True to False, modifying constants).
Run Tests: For each mutant, the entire test suite (or a relevant subset) is run.
Analyze Results:
- Killed Mutant: If any test fails, the mutant is considered “killed.” This is good! It means your tests detected the artificially introduced bug.
- Survived Mutant: If all tests pass despite the code change, the mutant “survived.” This indicates a potential weakness in your tests - the code was altered, but no test noticed.
- Error/Timeout: Sometimes mutations cause runtime errors or timeouts, which are also informative but distinct from killed/survived mutants.

The goal is to minimize the number of surviving mutants, thereby increasing confidence that your test suite can catch real regressions.

Introducing `mutmut`

For the pygeohash project, we decided to use mutmut, a popular mutation testing tool for Python. It’s known for its ease of use and integration capabilities.

Setting up `mutmut` for Pygeohash

Integrating mutmut involved a few configuration steps, primarily managed in our pyproject.toml:

Installation: Added mutmut to our development dependencies.
Configuration ([tool.mutmut] section):
- paths_to_mutate: Specified the Python modules within pygeohash/ that we wanted to test (e.g., distances.py, neighbor.py, types.py).
- tests_dir: Pointed to our tests/ directory.
- runner: Defined the command to execute our test suite (make test).
- also_copy: This was crucial for pygeohash. Since we use a compiled C extension (cgeohash), mutmut needed to copy these compiled files (.so) into its temporary testing environment alongside the mutated Python source. We also needed to copy the examples/ directory for certain tests to pass. The configuration looked something like this:
  [tool.mutmut] paths_to_mutate = [ "pygeohash/distances.py", # ... other modules ] tests_dir = "tests" runner = "make test" backup = false also_copy = [ "pygeohash/cgeohash/*", # Copy compiled C extension files "examples", # Copy examples directory ]
Execution: Ran mutmut using uv run python -m mutmut run.

Analyzing the Results

Our initial run yielded results like this (numbers illustrative):

🎉 505 🫥 78  ⏰ 0  🤔 0  🙁 344  🔇 0

🎉 505 Killed: Great! Our tests caught 505 mutants.
🫥 78 Survived: Our main focus. 78 changes went unnoticed by tests.
🙁 344 Error: Many mutations caused runtime errors.

To investigate the survivors, we used the command uv run mutmut show <mutant_id>, which displays the code diff introduced by the specific mutant.

Finding Real Test Improvements

Analyzing the survivors revealed several actionable insights:

1. Missing Boundary Check (is_valid_latitude)

Mutant ID: pygeohash.types.x_is_valid_latitude__mutmut_4
Mutation: Changed return -90 <= float(value) <= 90 to return -91 <= float(value) <= 90.
Problem: This survived because our tests didn’t specifically assert the behavior at the exact lower boundary (-90.0). Allowing -91 didn’t break existing checks.
Fix: We created a new test file (tests/test_types.py) and added parameterized tests specifically including is_valid_latitude(-90.0) == True and is_valid_latitude(-90.000001) == False.

2. Imprecise Error Message Assertion (get_adjacent)

Mutant ID: pygeohash.neighbor.x_get_adjacent__mutmut_9
Mutation: Changed raise ValueError("The geohash length...") to raise ValueError("XXThe geohash length...XX").
Problem: This survived because the existing test used pytest.raises(ValueError, match="cannot be 0"). This substring match still passed even with the extra “XX” characters added by mutmut.
Fix: We added a new test (test_zero_length_geohash in tests/test_neighbor.py) that asserts the exact error message using an anchored regex: pytest.raises(ValueError, match="^The geohash length cannot be 0. Possible when close to poles$").

We also identified several survivors that only affected logging statements. While important for debugging, these aren’t typically covered by functional tests. We marked these lines with # pragma: no mutate to exclude them from future runs.

Conclusion

Mutation testing with mutmut proved to be a valuable exercise for pygeohash. While code coverage showed high numbers, mutmut helped us pinpoint specific weaknesses in our test suite - missing boundary checks and assertions that weren’t precise enough. By analyzing the surviving mutants, we were able to add targeted tests and refine existing ones, ultimately increasing our confidence in the library’s reliability. It’s a worthwhile technique to incorporate, even occasionally, to go beyond simple coverage and truly evaluate test suite effectiveness.

Subscribe to the Newsletter

Get the latest posts and insights delivered straight to your inbox.

Mutation Testing with mumut for Pygeohash

What is Mutation Testing?

Introducing mutmut

Setting up mutmut for Pygeohash

Analyzing the Results

Finding Real Test Improvements

Conclusion

Subscribe to the Newsletter

Introducing `mutmut`

Setting up `mutmut` for Pygeohash