Mutation Testing with mumut for Pygeohash
Ensuring the reliability and correctness of a library like pygeohash
is paramount. While traditional code coverage metrics tell us which lines of code our tests execute, they don’t tell us how well those lines are tested. Did our tests actually check the logic, or did they just run through it? This is where mutation testing comes in.
What is Mutation Testing?
Mutation testing is a powerful technique for evaluating the quality of a test suite. The core idea is simple yet effective:
- Introduce Small Bugs: The mutation testing tool automatically modifies your source code in small, specific ways, creating “mutants.” These mutations are designed to mimic common programming errors (e.g., changing
<
to<=
,+
to-
,True
toFalse
, modifying constants). - Run Tests: For each mutant, the entire test suite (or a relevant subset) is run.
- Analyze Results:
- Killed Mutant: If any test fails, the mutant is considered “killed.” This is good! It means your tests detected the artificially introduced bug.
- Survived Mutant: If all tests pass despite the code change, the mutant “survived.” This indicates a potential weakness in your tests β the code was altered, but no test noticed.
- Error/Timeout: Sometimes mutations cause runtime errors or timeouts, which are also informative but distinct from killed/survived mutants.
The goal is to minimize the number of surviving mutants, thereby increasing confidence that your test suite can catch real regressions.
Introducing mutmut
For the pygeohash
project, we decided to use mutmut
, a popular mutation testing tool for Python. It’s known for its ease of use and integration capabilities.
Setting up mutmut
for Pygeohash
Integrating mutmut
involved a few configuration steps, primarily managed in our pyproject.toml
:
- Installation: Added
mutmut
to our development dependencies. - Configuration (
[tool.mutmut]
section):paths_to_mutate
: Specified the Python modules withinpygeohash/
that we wanted to test (e.g.,distances.py
,neighbor.py
,types.py
).tests_dir
: Pointed to ourtests/
directory.runner
: Defined the command to execute our test suite (make test
).also_copy
: This was crucial forpygeohash
. Since we use a compiled C extension (cgeohash
),mutmut
needed to copy these compiled files (.so
) into its temporary testing environment alongside the mutated Python source. We also needed to copy theexamples/
directory for certain tests to pass. The configuration looked something like this:[tool.mutmut] paths_to_mutate = [ "pygeohash/distances.py", # ... other modules ] tests_dir = "tests" runner = "make test" backup = false also_copy = [ "pygeohash/cgeohash/*", # Copy compiled C extension files "examples", # Copy examples directory ]
- Execution: Ran
mutmut
usinguv run python -m mutmut run
.
Analyzing the Results
Our initial run yielded results like this (numbers illustrative):
π 505 π«₯ 78 β° 0 π€ 0 π 344 π 0
π 505 Killed
: Great! Our tests caught 505 mutants.π«₯ 78 Survived
: Our main focus. 78 changes went unnoticed by tests.π 344 Error
: Many mutations caused runtime errors.
To investigate the survivors, we used the command uv run mutmut show <mutant_id>
, which displays the code diff introduced by the specific mutant.
Finding Real Test Improvements
Analyzing the survivors revealed several actionable insights:
1. Missing Boundary Check (is_valid_latitude
)
- Mutant ID:
pygeohash.types.x_is_valid_latitude__mutmut_4
- Mutation: Changed
return -90 <= float(value) <= 90
toreturn -91 <= float(value) <= 90
. - Problem: This survived because our tests didn’t specifically assert the behavior at the exact lower boundary (
-90.0
). Allowing-91
didn’t break existing checks. - Fix: We created a new test file (
tests/test_types.py
) and added parameterized tests specifically includingis_valid_latitude(-90.0) == True
andis_valid_latitude(-90.000001) == False
.
2. Imprecise Error Message Assertion (get_adjacent
)
- Mutant ID:
pygeohash.neighbor.x_get_adjacent__mutmut_9
- Mutation: Changed
raise ValueError("The geohash length...")
toraise ValueError("XXThe geohash length...XX")
. - Problem: This survived because the existing test used
pytest.raises(ValueError, match="cannot be 0")
. This substring match still passed even with the extra “XX” characters added bymutmut
. - Fix: We added a new test (
test_zero_length_geohash
intests/test_neighbor.py
) that asserts the exact error message using an anchored regex:pytest.raises(ValueError, match="^The geohash length cannot be 0. Possible when close to poles$")
.
We also identified several survivors that only affected logging statements. While important for debugging, these aren’t typically covered by functional tests. We marked these lines with # pragma: no mutate
to exclude them from future runs.
Conclusion
Mutation testing with mutmut
proved to be a valuable exercise for pygeohash
. While code coverage showed high numbers, mutmut
helped us pinpoint specific weaknesses in our test suite β missing boundary checks and assertions that weren’t precise enough. By analyzing the surviving mutants, we were able to add targeted tests and refine existing ones, ultimately increasing our confidence in the library’s reliability. It’s a worthwhile technique to incorporate, even occasionally, to go beyond simple coverage and truly evaluate test suite effectiveness.
Subscribe to the Newsletter
Get the latest posts and insights delivered straight to your inbox.