From Theory to Practice: Building Real Decision Models with Petersburg

Last year I wrote about decision making under uncertainty, exploring how expected value calculations break down when facing real-world complexity. I mentioned at the end that I’d built some cool simulations using a project called petersburg, but I didn’t have the images handy and never followed up.

Well, I’ve spent some time between JPMC and SVP jobs cleaning up that project, adding examples and case studies, and I think it’s worth revisiting. Not just because the code is cleaner, but because I’ve realized that having a practical tool for modeling these decisions changes how you think about them.

The Gap Between Theory and Practice

It’s one thing to talk abstractly about decision trees, probabilistic outcomes, and the St. Petersburg Paradox. It’s another thing entirely to sit down with a real decision, like whether to use a third-party vendor or build in-house, and actually model it.

The problem is that most real decisions have:

  • Multiple stages with dependencies
  • Uncertain probabilities you have to guess at
  • Non-linear costs that change over time
  • Feedback loops and compounding effects

You can’t just draw a simple tree on a whiteboard. Or rather, you can, but then you realize you’ve made a dozen assumptions and you have no idea which ones actually matter.

Enter Petersburg

Petersburg is a Python framework I built many years ago for modeling decision processes as directed graphs with automatic sensitivity analysis. The core idea is progressive refinement: start with a simple model and incrementally add complexity as you understand the problem better.

Here’s what it does:

  • Models decisions as directed acyclic graphs
  • Supports probabilistic edge weights and uncertainty distributions
  • Runs Monte Carlo simulations to generate outcome distributions
  • Automatically performs sensitivity analysis to show which parameters matter most
  • Visualizes everything with Mermaid diagrams

A Real Example: Build vs. Buy

Let me walk through the exact problem from my 2024 post: deciding whether to build a feature in-house or use a third-party vendor. This is the kind of decision every startup faces, and it’s thornier than it looks because the costs compound over time.

The setup: you need a new capability. You can build it yourself or use a vendor’s solution. If the vendor fails or doesn’t work out, you’ll have to rebuild in-house anyway, but by then your codebase has grown more coupled to the vendor and the reversion cost is higher.

Starting Simple

First iteration: just model the basic paths.

from petersburg import Petersburg

# Define the decision states
graph = Petersburg()
graph.add_node("start", payoff=0)
graph.add_node("build_inhouse", payoff=0)
graph.add_node("use_vendor", payoff=0)
graph.add_node("vendor_success", payoff=100)
graph.add_node("vendor_fails", payoff=0)
graph.add_node("rebuild_after_failure", payoff=50)

# Initial choice - build costs more upfront, vendor is faster
graph.add_edge("start", "build_inhouse",
    probability=1.0, cost=50)
graph.add_edge("start", "use_vendor",
    probability=1.0, cost=20)

# Vendor path - maybe it works, maybe it doesn't
graph.add_edge("use_vendor", "vendor_success",
    probability=0.7, cost=0)
graph.add_edge("use_vendor", "vendor_fails",
    probability=0.3, cost=0)

# If vendor fails, we rebuild (but now it's more expensive)
graph.add_edge("vendor_fails", "rebuild_after_failure",
    probability=1.0, cost=80)

# In-house path gets full value
graph.set_node_payoff("build_inhouse", 100)

This is the mental model most people use. But it’s missing something critical: the cost of rebuilding increases with time.

Adding the Time Dimension

Second iteration: model how the reversion cost grows as your product matures.

from petersburg.distributions import GaussianNode

# Early in product lifecycle, can salvage 50% of vendor work
early_salvage_rate = 0.5
# Late in lifecycle, maybe only 10%
late_salvage_rate = 0.1

# Model the actual rebuild cost based on salvage rate
early_rebuild_cost = 50 * (1 - early_salvage_rate)
late_rebuild_cost = 50 * (1 - late_salvage_rate)

# Replace the fixed rebuild cost with an uncertain one, assuming
# the failure happens late in the product lifecycle
graph.add_edge("vendor_fails", "rebuild_after_failure",
    probability=1.0,
    cost=GaussianNode(mean=late_rebuild_cost, std=10))

Now we’re capturing the reality that the longer you’re on the vendor, the more expensive it is to leave. This non-linearity is what makes the decision hard.
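
To make “the longer you stay, the more it costs to leave” concrete, here’s a tiny standalone sketch in plain Python. The decay slope and the 10% floor are made-up numbers for illustration, not anything measured:

# Hypothetical decay of salvageable work as the product matures;
# the 2%-per-month slope and the 10% floor are illustrative
def salvage_rate(months_on_vendor):
    return max(0.1, 0.5 - 0.02 * months_on_vendor)

def rebuild_cost(months_on_vendor, base_cost=50):
    return base_cost * (1 - salvage_rate(months_on_vendor))

print(rebuild_cost(3))    # ~28: leaving early is cheap
print(rebuild_cost(24))   # 45.0: leaving late is close to a full rebuild

A function like this could feed the mean of the GaussianNode above instead of hard-coding late_rebuild_cost.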

Finding What Matters

Third iteration: run sensitivity analysis to see what actually drives the decision.

results = graph.simulate(n_iterations=10_000)
sensitivity = graph.sensitivity_analysis()

# Which parameters most affect the decision?
print(sensitivity.top_drivers())

Running this across different scenarios revealed some non-obvious insights:

  1. Salvage rate dominates: If you can reuse >50% of the vendor integration work, the vendor path is almost always better. Below 50%, in-house starts winning quickly.

  2. Failure probability is surprisingly forgiving: Even at 40% failure rates, the vendor path can work if salvage rates are high.

  3. Initial cost difference matters less than you think: Saving 20 units upfront doesn’t matter much if rebuild costs 80 units later.

The key insight from the sensitivity analysis was that I’d been focusing on the wrong variable. Instead of obsessing over “what’s the probability the vendor fails?”, I should have been asking “how can we architect this to maximize salvageable work?”

That’s an actionable question with a different set of solutions.
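
To see roughly where that salvage threshold sits, here’s a back-of-the-envelope sketch using the point estimates from the first iteration. This isn’t the output of petersburg’s sensitivity analysis, just expected-value arithmetic; the full model layers uncertainty distributions on top of these numbers:

# Point-estimate sketch of the tradeoff, using the illustrative
# numbers from the build-vs-buy example above
p_fail = 0.3
upfront_saving = 50 - 20        # in-house build cost minus vendor cost
payoff_gap = 100 - 50           # a rebuilt path pays out less than a clean build

for salvage in (0.1, 0.3, 0.5, 0.7, 0.9):
    rebuild = 80 * (1 - salvage)      # reversion cost shrinks as salvage improves
    failure_penalty = p_fail * (rebuild + payoff_gap)
    verdict = "vendor" if failure_penalty < upfront_saving else "in-house"
    print(f"salvage={salvage:.1f}: expected failure penalty {failure_penalty:.1f} "
          f"vs upfront saving {upfront_saving} -> {verdict}")

Even this crude version puts the crossover somewhere near the middle of the salvage range, which is why architecting for salvageable work moves the decision more than haggling over the vendor’s failure probability.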

What I’ve Learned About Decision Modeling

Working with petersburg has changed how I think about decisions in a few ways:

1. Start Dumber Than You Think

My first instinct is always to build a sophisticated model that captures every nuance. This is wrong. Start with something embarrassingly simple, maybe just three nodes and two edges. Run it. See what happens. You’ll immediately notice what’s missing.
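
For the build-vs-buy question, “embarrassingly simple” might look like this: one decision, two outcomes, nothing else (same hypothetical API calls as the examples above):

from petersburg import Petersburg

# Three nodes, two edges: take the vendor, it either works or it doesn't
g = Petersburg()
g.add_node("start", payoff=0)
g.add_node("works", payoff=100)
g.add_node("does_not_work", payoff=0)
g.add_edge("start", "works", probability=0.7, cost=20)
g.add_edge("start", "does_not_work", probability=0.3, cost=20)

print(g.simulate(n_iterations=1_000))

It’s obviously incomplete, since it ignores the rebuild path entirely, and that’s the point: the first thing you’ll want to add is whatever the model is most obviously missing.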

2. Uncertainty About Uncertainty

The most valuable thing to model isn’t the base case; it’s your uncertainty about the parameters. If you think there’s a 40% chance of success, but you’re really uncertain about that, model it as a distribution centered at 40%. The shape of that distribution often matters more than the center.
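
One way to do that without any special machinery is to sample the parameter itself and see how often you land on the wrong side of break-even. This is plain numpy, independent of petersburg, and the Beta parameters are just illustrative:

import numpy as np

rng = np.random.default_rng(0)

# Two beliefs with the same 40% center but very different confidence
confident_p = rng.beta(40, 60, size=10_000)   # tight around 0.4
uncertain_p = rng.beta(2, 3, size=10_000)     # centered near 0.4, but wide

# Expected net value of a bet that pays 100 on success and costs 30,
# given success probability p
def net_value(p):
    return p * 100 - 30

# Fraction of sampled worlds in which the bet comes out positive
print((net_value(confident_p) > 0).mean())
print((net_value(uncertain_p) > 0).mean())

Both beliefs have the same center, but they imply very different chances of being on the wrong side of break-even, and that difference is what the simulation surfaces.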

3. Sensitivity Analysis Is the Point

The goal isn’t to predict the future. It’s to understand which factors drive the outcome. If your decision would be the same across a wide range of parameter values, great! You can stop worrying about getting those estimates perfect. If it’s highly sensitive to one assumption, that’s where you should focus your research.

4. Emergent Behavior from Simple Rules

One of the coolest things about petersburg is watching complex patterns emerge from simple binary transitions. You don’t have to explicitly model second-order effects; they emerge naturally from the structure of the graph and the distributions you use.

Other Case Studies

The repository now includes several other worked examples:

Startup Funding Journey: Models the progression through funding rounds with realistic success rates. Demonstrates how power law distributions (most startups fail, a few succeed massively) emerge naturally from simple stage-by-stage filtering.
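
The mechanism is easy to see even without petersburg. Here’s a quick numpy sketch of stage-by-stage filtering, where the survival rates and value multipliers are made up for illustration rather than taken from the repository’s case study:

import numpy as np

rng = np.random.default_rng(1)
n = 100_000                          # simulated startups
value = np.ones(n)
alive = np.ones(n, dtype=bool)

# Each round filters hard; survivors' value gets a noisy multiple
for survival_rate in (0.4, 0.35, 0.3, 0.25):
    alive &= rng.random(n) < survival_rate
    value *= np.where(alive, rng.lognormal(1.0, 0.75, size=n), 0.0)

survivors = value[value > 0]
print(f"survived every round: {alive.mean():.1%}")   # roughly 1%
print(f"median survivor value: {np.median(survivors):.0f}, "
      f"best survivor value: {survivors.max():.0f}")

Nothing in the loop encodes a heavy tail explicitly, but repeated filtering plus multiplicative growth produces one: almost everything ends at zero, and a handful of outcomes are enormous.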

Drug Development Pipeline: Models the progression through clinical trials with realistic failure rates at each phase. Helps pharma companies decide which compounds to advance.

Product Launch: Captures the uncertainty cascade of market testing, feature development, and go-to-market timing decisions.

Litigation Strategy: Models settlement vs. trial decisions with uncertain outcomes and non-linear costs.

Each one follows the same pattern: start simple, add uncertainty, run sensitivity analysis, get insights you wouldn’t have had otherwise.

When This Actually Helps

I don’t use petersburg for most decisions. Making a simple model when the decision is obvious is procrastination dressed up as rigor. But it’s been genuinely useful when:

  • The decision has multiple stages and dependencies
  • There’s genuine uncertainty about key parameters
  • The costs or payoffs are non-linear
  • Different stakeholders have different intuitions about what matters
  • You need to communicate why you’re choosing a particular path

That last one is important. A petersburg model with sensitivity analysis is much more persuasive than “I think we should do X because my gut says so.” It forces you to make your assumptions explicit and shows which ones actually drive the decision.

Try It Yourself

The code is all on GitHub with examples and case studies. If you’ve got a gnarly decision you’re facing - especially one with multiple stages and uncertain outcomes - try modeling it.

You might find that the process of building the model teaches you more than the simulation results. That’s fine. The point isn’t to have a crystal ball. It’s to think more clearly about complex problems.

And if you do build something interesting, let me know. I’m always curious to see what kinds of decisions people are modeling.
