A Full SDLC with MCP Servers
I’ve been building and using MCP servers for about a year now, and at some point I realized they’d quietly grown to cover most of the software development lifecycle. Not in a grand, planned-out way. Each one solved a specific annoyance, and taken together they form something more complete than I intended.
Here’s how they all fit together.
The Setup
When I’m working on a coding project with Claude Code, I typically have some combination of these MCP servers running:
- ContextSwitch or todolist-mcp for task management
- makefile-mcp for running builds, tests, and linters
- mutmut-mcp for mutation testing
- hugo-frontmatter-mcp when I’m working on this blog
- writing-tools-mcp when I’m writing anything
Not all at once necessarily, but in various combinations depending on what I’m doing. The point is that each one handles a discrete part of the process and the agent orchestrates between them.
Task Management: ContextSwitch and todolist-mcp
Every project starts with tasks. For personal projects and coding sessions, I use one of two tools depending on context.
todolist-mcp is the open-source, cross-platform option. It’s a SQLite-backed todo list with MCP tools for adding, updating, filtering, and completing tasks. It supports dependencies (task B is blocked by task A), priorities, tags, and due dates. There’s also a kanban web UI if you want a visual view.
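The dependency model is the interesting part. A minimal sketch of how a SQLite-backed task store with blocking relationships might look (the schema here is my illustration, not todolist-mcp’s actual tables):

```python
import sqlite3

# Hypothetical schema sketch -- not todolist-mcp's real tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tasks (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    priority INTEGER DEFAULT 0,
    due_date TEXT,
    done INTEGER DEFAULT 0
);
CREATE TABLE task_deps (
    task_id INTEGER REFERENCES tasks(id),    -- the blocked task
    blocker_id INTEGER REFERENCES tasks(id)  -- the task it waits on
);
""")
conn.execute("INSERT INTO tasks (id, title) VALUES (1, 'Design data model')")
conn.execute("INSERT INTO tasks (id, title) VALUES (2, 'Build UI')")
conn.execute("INSERT INTO task_deps VALUES (2, 1)")  # UI blocked by data model

# A task is "ready" when it is not done and has no unfinished blockers.
ready = conn.execute("""
    SELECT t.title FROM tasks t
    WHERE t.done = 0 AND NOT EXISTS (
        SELECT 1 FROM task_deps d
        JOIN tasks b ON b.id = d.blocker_id
        WHERE d.task_id = t.id AND b.done = 0
    )
""").fetchall()
print(ready)  # only the unblocked task appears
```

The `NOT EXISTS` query is what lets an agent ask “what can I work on right now?” in one call instead of reasoning about the graph itself.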
ContextSwitch is what todolist-mcp grew into: a native macOS app with a built-in MCP server. It supports multiple databases simultaneously, has a timeline view, global search, and a proper native UI. It’s on the Mac App Store.
In practice, an agent with access to either of these can manage its own work queue. When I built Evergreen, I had one subagent acting as a project manager, maintaining the kanban board, while a QA subagent ran checks before any task got marked done. The project manager would pull the next ready task (respecting dependency ordering), the coding agent would implement it, the QA agent would verify it, and the loop continued.
This is where task dependencies really matter. Without them, the agent would try to build the UI before the data model existed, or write tests before the function was implemented. Explicit dependencies give the agent a coherent plan to follow.
Build and Test: makefile-mcp
Once the agent is working on a task, it needs to run things. makefile-mcp exposes your Makefile targets as MCP tools. If your Makefile has test, lint, format, and build targets, the agent gets make_test, make_lint, make_format, and make_build tools.
This is deliberately simple. I don’t want the agent constructing arbitrary shell commands. I want it using the same targets I’ve already defined and tested. The Makefile is the contract.
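Discovering targets and turning them into tool names is a small amount of code. A rough sketch of the idea, assuming a simple line-based scan (makefile-mcp’s actual parsing may differ):

```python
import re

# Toy Makefile with tab-indented recipes.
MAKEFILE = "test:\n\tpytest\n\nlint:\n\truff check .\n\n.PHONY: test lint\n"

# Match lines that start with a plain target name followed by a colon.
# Deliberately skips special targets like .PHONY.
TARGET_RE = re.compile(r"^([A-Za-z][\w-]*):", re.MULTILINE)

def tool_names(makefile_text):
    """Map each discovered Makefile target to an MCP-style tool name."""
    return [f"make_{t}" for t in TARGET_RE.findall(makefile_text)]

print(tool_names(MAKEFILE))  # ['make_test', 'make_lint']
```

Because the tools are generated from the Makefile, the agent’s capabilities stay in lockstep with whatever the project already defines.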
The recent addition of output caching makes this much more useful for debugging. When a test suite produces hundreds of lines of output, the agent gets the tail plus an execution_id. It can then search the cached output for specific error messages or paginate through the full log. This is especially helpful for large test suites where the first failure causes a cascade and the relevant error is somewhere in the middle.
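The caching pattern itself is simple: store the full output, hand back only the tail plus an id. A sketch with illustrative names (`run_and_cache`, `search_output`, and `page_output` are my stand-ins, not the server’s real API):

```python
import uuid

_cache = {}  # execution_id -> full output, split into lines

def run_and_cache(output, tail_lines=5):
    """Cache full command output; return only the tail plus an id."""
    execution_id = str(uuid.uuid4())
    lines = output.splitlines()
    _cache[execution_id] = lines
    return {"execution_id": execution_id, "tail": lines[-tail_lines:]}

def search_output(execution_id, needle):
    """Find (line_number, line) pairs containing needle in cached output."""
    return [(i + 1, line) for i, line in enumerate(_cache[execution_id])
            if needle in line]

def page_output(execution_id, page=0, page_size=50):
    """Return one page of the cached output."""
    start = page * page_size
    return _cache[execution_id][start:start + page_size]

# Simulate a noisy test run with the real error buried at the end.
log = "\n".join(f"line {i}" for i in range(200)) + "\nERROR: assertion failed"
result = run_and_cache(log)
hits = search_output(result["execution_id"], "ERROR")
print(hits)
```

The agent sees a few lines instead of the whole log, then drills in only where it needs to, which keeps long test output from flooding its context.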
Test Quality: mutmut-mcp
Writing tests is one thing. Writing tests that actually catch bugs is another. mutmut-mcp fills this gap with mutation testing.
The workflow is straightforward: mutmut modifies your source code in small ways (changing operators, negating conditions, swapping values) and runs your test suite against each mutation. If your tests still pass with a mutation applied, that’s a “survivor” and it points to a gap in your coverage.
The MCP server wraps this into an agentic loop. The agent runs mutmut, gets the list of survivors, prioritizes them by likely materiality (core logic over logging code), writes targeted tests to kill the high-priority survivors, and verifies the mutants are dead. I’ve watched this loop run hands-off and it genuinely finds real coverage gaps that line-based coverage metrics miss.
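The prioritization step can be as simple as a scoring function over file paths. A sketch of the idea; the scoring rule below is my illustration, not mutmut-mcp’s actual heuristic:

```python
def priority(survivor):
    """Lower number = handle first. Core logic beats cosmetic code paths."""
    path = survivor["file"]
    if "logging" in path or "cli" in path:
        return 2  # low value: a surviving mutant here rarely matters
    if path.startswith("core/"):
        return 0  # high value: core business logic
    return 1

# Hypothetical surviving mutants reported by a mutation-testing run.
survivors = [
    {"id": "m3", "file": "app/logging.py"},
    {"id": "m1", "file": "core/pricing.py"},
    {"id": "m2", "file": "util/helpers.py"},
]

ranked = sorted(survivors, key=priority)
print([s["id"] for s in ranked])  # core logic first, logging last
```

The agent then works through the ranked list, writing a targeted test for each survivor and re-running the mutation to confirm it dies.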
Content Management: hugo-frontmatter-mcp
This one’s specific to Hugo static sites, but the pattern is general. hugo-frontmatter-mcp handles all the frontmatter operations on Markdown files: reading fields, setting titles and dates, managing tags and images, and batch operations across directories.
I built it because models kept producing malformed YAML when editing frontmatter directly. Simple as that. Now the model never touches the raw YAML; it calls typed tools that handle serialization correctly. The batch operations (rename a tag across 200 posts, validate all dates match a format) save real time when managing a blog with hundreds of posts.
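The “typed tool instead of raw text edits” idea looks roughly like this. This toy sketch handles only flat `key: value` frontmatter; the real server uses a proper YAML parser, and `set_field` is an illustrative name:

```python
import re

# Match a leading frontmatter block delimited by --- lines.
FM_RE = re.compile(r"^---\n(.*?)\n---\n", re.DOTALL)

def set_field(markdown, key, value):
    """Parse flat frontmatter, set one field, and re-serialize the post."""
    match = FM_RE.match(markdown)
    fields = dict(
        line.split(": ", 1) for line in match.group(1).splitlines()
    )
    fields[key] = value  # the model supplies typed arguments, never raw YAML
    block = "\n".join(f"{k}: {v}" for k, v in fields.items())
    return f"---\n{block}\n---\n{markdown[match.end():]}"

post = "---\ntitle: Old title\ndraft: true\n---\nBody text.\n"
updated = set_field(post, "title", "New title")
print(updated)
```

The serialization code is written once and tested once; the model just supplies a key and a value, so there is no opportunity to emit malformed YAML.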
Writing Quality: writing-tools-mcp
writing-tools-mcp is the one that gets the most use outside of coding. It provides readability analysis, keyword density, spellchecking, passive voice detection, and AI content detection.
The AI detection piece is interesting in this context. When I’m co-writing a post with Claude, I use the stylometric analysis as a self-check. If a section scores as likely AI-generated, I know it needs more of my voice. It’s a feedback loop that helps keep the writing authentic.
For readability, the tools do things that models genuinely struggle with on their own. Character counting, word counting, reading time estimates. These seem trivial but models get them wrong because of tokenization. Having accurate tools for this means the agent can actually optimize for a target reading level or word count.
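These counting tools are trivial to implement deterministically, which is exactly why they belong in a tool rather than in the model. A sketch (the 200 words-per-minute figure is a common convention, not necessarily what writing-tools-mcp uses):

```python
import math

def word_count(text):
    """Exact whitespace-delimited word count -- no tokenization guesswork."""
    return len(text.split())

def reading_time_minutes(text, wpm=200):
    """Estimated reading time, rounded up, with a one-minute floor."""
    return max(1, math.ceil(word_count(text) / wpm))

sample = "word " * 450
print(word_count(sample), reading_time_minutes(sample))  # 450 3
```

With exact numbers available, “trim this to 400 words” becomes a target the agent can actually iterate toward instead of estimate.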
How It All Fits Together
Here’s what a typical coding session looks like with all of this running:
1. Agent checks the task board (ContextSwitch/todolist-mcp) for the next ready task
2. Agent picks up a task and marks it in-progress
3. Agent implements the change
4. Agent runs the test suite (makefile-mcp) and fixes any failures
5. Agent runs the linter and formatter (makefile-mcp)
6. If it’s a critical path, agent runs mutation testing (mutmut-mcp) and writes additional tests
7. Agent marks the task done and picks up the next one
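The whole loop fits in a few lines once each server is reduced to a function. A sketch with every name a stand-in (the `Board` class and the callbacks are illustrative, not any server’s real API):

```python
class Board:
    """Toy in-memory task board standing in for the MCP task manager."""
    def __init__(self, tasks):
        self.todo = list(tasks)
        self.done = []
    def next_ready(self):
        return self.todo[0] if self.todo else None
    def mark_in_progress(self, task):
        pass  # a real board would persist the status change
    def mark_done(self, task):
        self.done.append(self.todo.pop(0))

def run_session(board, implement, make, mutation_test, is_critical):
    """Drive the task loop: implement, test, lint, optionally mutate."""
    while (task := board.next_ready()) is not None:
        board.mark_in_progress(task)
        implement(task)
        while not make("test", task):
            implement(task)  # fix failures and retry
        make("lint", task)
        make("format", task)
        if is_critical(task):
            mutation_test(task)
        board.mark_done(task)

board = Board(["data model", "parser", "UI"])
run_session(board,
            implement=lambda t: None,
            make=lambda target, t: True,   # pretend every target passes
            mutation_test=lambda t: None,
            is_critical=lambda t: t == "parser")
print(board.done)  # all three tasks completed in order
```

Swapping the lambdas for real MCP tool calls is the entire difference between this sketch and the actual workflow.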
For writing sessions, replace steps 3-6 with: draft the content, run readability and keyword analysis, check for passive voice, verify the AI detection score is acceptable, adjust as needed.
The key insight is that none of these servers are particularly complex on their own. They’re thin wrappers around tools I was already using. The power is in giving the agent access to the same development workflow I’d use manually, so it can run the full loop without me doing anything.
The State of Things
All five open-source servers now have proper test suites, GitHub Actions CI, and one-command installation via uvx. They’ve been stable in daily use for months.
If you’re building an agentic coding workflow, you don’t need all of these. Start with makefile-mcp and a task manager. That alone changes how much an agent can accomplish autonomously. Add the others as needs arise.