mutmut-mcp: Tests, CI, and Survivor Prioritization
mutmut-mcp wraps the mutmut mutation testing tool so an agent can run mutation tests and act on the results without you doing anything. I wrote about it in the practical MCP use post, where I described the basic workflow: the agent runs mutmut, looks at surviving mutants, writes tests to kill them, validates, and repeats.
That workflow still works well. What didn’t work well was the lack of tests on the MCP server itself. A testing tool with no tests. I know.
What Changed
The server now has 23 tests covering command construction, error handling, virtual environment support, and the survivor prioritization logic. These are mostly mock-based since you don’t want to actually run mutation testing in a unit test, but they verify that the right commands are being built and the right error messages come back.
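To make "mock-based" concrete, here is a minimal sketch of what such a test might look like. It is not the server's actual code: the run_mutmut wrapper below, its venv_path parameter, and its error string are hypothetical, and the --paths-to-mutate flag is the mutmut 2.x spelling, so the real command may be built differently.

```python
import subprocess
from unittest import mock


def run_mutmut(target, venv_path=None):
    """Hypothetical wrapper: build a mutmut command and run it."""
    # Assumes a POSIX-style virtual environment layout (bin/ rather than Scripts/).
    executable = f"{venv_path}/bin/mutmut" if venv_path else "mutmut"
    # Flag name follows mutmut 2.x; treat it as an assumption.
    cmd = [executable, "run", "--paths-to-mutate", target]
    try:
        result = subprocess.run(cmd, capture_output=True, text=True)
    except FileNotFoundError:
        return f"Error: {executable} not found"
    return result.stdout


def test_command_uses_venv_executable():
    # Patch subprocess.run so no mutation testing actually happens.
    with mock.patch("subprocess.run") as fake_run:
        fake_run.return_value = mock.Mock(stdout="ok", returncode=0)
        out = run_mutmut("src/pkg", venv_path=".venv")
    fake_run.assert_called_once_with(
        [".venv/bin/mutmut", "run", "--paths-to-mutate", "src/pkg"],
        capture_output=True,
        text=True,
    )
    assert out == "ok"


def test_missing_executable_returns_error_message():
    # Simulate mutmut not being installed.
    with mock.patch("subprocess.run", side_effect=FileNotFoundError):
        out = run_mutmut("src/pkg")
    assert out == "Error: mutmut not found"
```

The test never touches mutmut itself. It only checks that the wrapper assembles the expected argument list and turns a missing executable into a readable message.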
The project also got GitHub Actions CI, pre-commit hooks, ruff for linting and formatting, and uvx support:
uvx --from git+https://github.com/wdm0006/mutmut-mcp mutmut-mcp
Survivor Prioritization
The feature I still like most about this server is prioritize_survivors. When mutmut finds surviving mutants, you can end up staring at a long list wondering which ones actually matter. This tool scores each survivor by how much it is likely to matter, using a simple heuristic: mutations in core logic get flagged as high priority, while changes to logging, debug statements, or print calls get deprioritized.
It’s not a sophisticated analysis, but it’s enough to focus your attention. When you’ve got 40 survivors and half of them are in logging setup, knowing you can skip those and focus on the actual business logic saves real time.
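A heuristic along these lines can be sketched in a few lines of Python. This is an illustration of the idea, not the server's implementation: the Survivor type, the keyword pattern, and the two-level score are all assumptions.

```python
import re
from dataclasses import dataclass

# Hypothetical pattern for lines that are probably logging or debug output.
LOW_PRIORITY = re.compile(r"\b(logger|logging|log|debug|print)\b", re.IGNORECASE)


@dataclass
class Survivor:
    mutant_id: str
    source_line: str  # the line of source code the mutant changed


def score(survivor):
    """Return 1 for likely core logic, 0 for likely logging/debug code."""
    return 0 if LOW_PRIORITY.search(survivor.source_line) else 1


def prioritize(survivors):
    """Put high-scoring survivors first, keeping the original order within ties."""
    return sorted(survivors, key=lambda s: -score(s))


survivors = [
    Survivor("1", 'logger.info("starting transfer")'),
    Survivor("2", "if balance >= amount:"),
    Survivor("3", "print(total)"),
    Survivor("4", "return price * quantity"),
]
ranked = prioritize(survivors)
```

With this input, the two business-logic mutants (2 and 4) sort ahead of the logging and print mutants (1 and 3). A keyword match is crude and will misclassify some lines, which is consistent with treating the score as a way to focus attention rather than as a verdict.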
The Agentic Loop
The real power here is letting an agent drive the whole thing. The workflow looks like:
- Agent calls run_mutmut on your module
- Agent calls show_survivors to see what lived
- Agent calls prioritize_survivors to rank them
- Agent writes new tests targeting the high-priority survivors
- Agent runs the test suite to make sure the new tests pass
- Agent calls rerun_mutmut_on_survivor to verify the mutants are now killed
- Repeat until coverage is solid
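The steps above can be sketched as a plain Python loop. The tool names come from the server, but everything else here is an assumption made for illustration: the argument shapes, the dict format for survivors, the write_tests and run_tests callbacks, and the in-memory FakeTools stand-in that replaces real MCP calls.

```python
def mutation_loop(tools, module, write_tests, run_tests, max_rounds=5):
    """Drive the loop; return how many rounds wrote new tests."""
    tools.run_mutmut(module)
    for round_number in range(1, max_rounds + 1):
        survivors = tools.show_survivors()
        ranked = tools.prioritize_survivors(survivors)
        high = [s for s in ranked if s["priority"] == "high"]
        if not high:
            return round_number - 1  # nothing important left to kill
        write_tests(high)
        if not run_tests():
            raise RuntimeError("new tests fail on the unmutated code")
        for survivor in high:
            tools.rerun_mutmut_on_survivor(survivor["id"])
    return max_rounds


class FakeTools:
    """In-memory stand-in so the loop can run without mutmut or an MCP client."""

    def __init__(self, survivors):
        self.survivors = dict(survivors)  # mutant id -> priority

    def run_mutmut(self, module):
        pass  # a real call would start a mutation run here

    def show_survivors(self):
        return [{"id": i, "priority": p} for i, p in self.survivors.items()]

    def prioritize_survivors(self, survivors):
        # False sorts before True, so high-priority entries come first.
        return sorted(survivors, key=lambda s: s["priority"] != "high")

    def rerun_mutmut_on_survivor(self, mutant_id):
        # Pretend the newly written tests always kill the mutant.
        self.survivors.pop(mutant_id, None)


tools = FakeTools({"1": "high", "2": "low", "3": "high"})
rounds = mutation_loop(
    tools,
    "pkg.core",
    write_tests=lambda survivors: None,  # the agent would write tests here
    run_tests=lambda: True,              # the agent would run the suite here
)
```

In this toy run the loop does one round of work, kills mutants 1 and 3, and stops with only the low-priority mutant 2 left. The max_rounds cap is a design choice for the sketch: an unattended loop needs some stopping condition other than "no survivors", since some mutants are equivalent to the original code and can never be killed.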
I’ve had this loop running hands-off on real codebases and it genuinely improves test quality. The mutations it finds are often things you wouldn’t think to test: off-by-one errors, wrong comparison operators, negated conditions.
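As an illustration of the comparison-operator case, consider the toy example below. The function names are hypothetical and the mutant is written out by hand to show the kind of change a mutation tool might make; it is not output from mutmut.

```python
def can_withdraw(balance, amount):
    """Original code: withdrawing the full balance is allowed."""
    return balance >= amount


def can_withdraw_mutant(balance, amount):
    """Hand-written mutant: >= has been changed to >."""
    return balance > amount


def weak_test(fn):
    # Checks only values far from the boundary, so it cannot tell >= from >.
    return fn(100, 50) is True and fn(10, 50) is False


def boundary_test(fn):
    # Checks the exact boundary, where >= and > disagree.
    return fn(100, 100) is True


# The weak test passes on both versions, so the mutant survives it.
weak_results = (weak_test(can_withdraw), weak_test(can_withdraw_mutant))

# The boundary test passes on the original and fails on the mutant, killing it.
boundary_results = (boundary_test(can_withdraw), boundary_test(can_withdraw_mutant))
```

The weak test is the sort of thing that looks fine in a coverage report, since every line runs. Only the boundary case distinguishes the two versions, and that is the test a surviving mutant points you toward.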