A framework for building reliable AI agents

Agents in production require reliability. Reliability requires error correction. Error correction requires structure.

Humans learn the rules of a domain through experience. What's allowed, what isn't, what combinations make sense. None of this is written down. People just know.

Agents are fluent, not experienced. They'll try things any human would avoid. Without explicit rules to check against, the mistakes go through. That's why the same agent that demos beautifully breaks in production.

The Demo-to-Production Gap

Your agent handles most cases well. The rest corrupt state, make wrong calls, or do the wrong thing confidently. The errors are quiet. You find them later.

Errors Compound

99% accuracy per decision means 37% success at 100 decisions. Agents operate faster than humans can review. Errors accumulate before anyone notices.
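The compounding arithmetic is easy to check for yourself. A run succeeds only if every decision in it succeeds, so per-run success is per-decision accuracy raised to the number of decisions:

```python
# Probability an agent completes a run with zero errors,
# given a fixed, independent per-decision accuracy.
def run_success_rate(per_decision_accuracy: float, decisions: int) -> float:
    return per_decision_accuracy ** decisions

print(f"{run_success_rate(0.99, 100):.0%}")   # 0.99^100 ≈ 37%
print(f"{run_success_rate(0.99, 1000):.4%}")  # effectively zero at 1,000 decisions
```

The independence assumption is generous to the agent; correlated errors make the real numbers worse.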

The Missing Structure

Making the model smarter doesn't solve it. Neither do guardrails. Agents need explicit rules to operate within—and those rules were never written down because humans didn't need them.

The Core Insight

Agents are translators. They convert fuzzy human input into structured system operations. But translation requires something to translate into. Without structure, there's nothing to translate to.

The key to long-running agents isn't making fewer errors; it's detecting and correcting them. As the compounding math above shows, you can't push per-decision error rates low enough. You need errors caught before they compound.

When the rules are explicit, you can catch errors and fix them instead of letting them compound.

What structure gives you

  • Errors get caught. When the rules are explicit, violations fail immediately instead of passing through silently.
  • State is visible. You know what state things are in. You can see what happened and why.
  • Recovery is possible. Operations can be undone. You can go back to a known good point. Errors get corrected, not compounded.
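The three properties above can be sketched in a few lines. This is an illustrative toy, not the framework's implementation; the names (`CheckpointedState`, `apply`, `rollback`) are hypothetical:

```python
import copy

class CheckpointedState:
    """Toy sketch: explicit state with checkpoints, so operations
    can be validated against explicit rules and rolled back."""

    def __init__(self, state: dict):
        self._state = state
        self._checkpoints: list[dict] = []

    def checkpoint(self) -> None:
        # Save a known-good point before the agent acts.
        self._checkpoints.append(copy.deepcopy(self._state))

    def apply(self, op, validate) -> bool:
        # Run the operation only if it passes an explicit rule check;
        # a violation fails immediately instead of passing silently.
        self.checkpoint()
        op(self._state)
        if not validate(self._state):
            self.rollback()  # recovery: return to the known-good point
            return False
        return True

    def rollback(self) -> None:
        self._state = self._checkpoints.pop()

    @property
    def state(self) -> dict:
        # State is visible: inspectable at any time.
        return copy.deepcopy(self._state)
```

For example, an operation that would drive a balance negative fails the validator and the state reverts, while a legal operation goes through:

```python
cs = CheckpointedState({"balance": 100})
cs.apply(lambda s: s.update(balance=s["balance"] - 150),
         validate=lambda s: s["balance"] >= 0)   # rejected, state unchanged
cs.apply(lambda s: s.update(balance=s["balance"] - 40),
         validate=lambda s: s["balance"] >= 0)   # accepted
```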

Start Here (~11 min read)

The Structure Problem

The complete argument: why agent deployments fail, what structure actually means, and how to build it incrementally. Read this first for the full framework, or explore the essays below for depth on individual topics.

Read the Core Essay

A Reading Experience Built for Learning

Highlight & Annotate

Mark passages that resonate. Add your own notes. Build a personal layer of insight on top of the text.

Share & Export

Share specific passages with colleagues. Export your highlights and notes as markdown for your own reference.

Private by Design (no account needed)

Your data stays on your device. No accounts, no tracking, no cloud sync. Your annotations are yours alone.

Explorations

Eleven essays exploring the framework from different angles. Start with the core essay above, then follow your interests—each exploration stands alone while building on the central argument.

Who This Is For

Engineers

Understand why some agent systems work reliably while others fail. Learn patterns that transfer across domains.

Product Managers

Know what infrastructure needs to exist before agents can succeed. Avoid the demo-to-production trap.

Executives

See where the real competitive advantage lies. Understand why some investments pay off and others don't.

Investors

Identify companies building durable moats. Spot the difference between demos and defensible products.

Start with the Framework

The core essay covers the complete argument in about 11 minutes. Everything else builds on those foundations.

Read the Core Essay