A framework for building reliable AI agents
Agents in production require reliability. Reliability requires error correction. Error correction requires structure.
Humans learn the rules of a domain through experience. What's allowed, what isn't, what combinations make sense. None of this is written down. People just know.
Agents are fluent, not experienced. They'll try things any human would avoid. Without explicit rules to check against, the mistakes go through. That's why the same agent that demos beautifully makes errors and breaks in production.
The Demo-to-Production Gap
Your agent handles most cases well. The rest corrupt state, make wrong calls, or do the wrong thing confidently. The errors are quiet. You find them later.
Errors Compound
99% accuracy per decision means 37% success at 100 decisions. Agents operate faster than humans can review. Errors accumulate before anyone notices.
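The arithmetic behind that 37% figure is just repeated multiplication of independent per-decision success probabilities; a minimal sketch:

```python
# Probability that a run of independent decisions all succeed,
# given the same per-decision accuracy.
def run_success_probability(per_decision_accuracy: float, decisions: int) -> float:
    return per_decision_accuracy ** decisions

print(run_success_probability(0.99, 100))   # ~0.366: 99% accuracy, 100 decisions
print(run_success_probability(0.999, 100))  # ~0.905: a 10x better error rate
```

Note how steep the curve is: even a tenfold reduction in per-decision error rate still loses roughly one run in ten at 100 decisions, which is why detection and correction, not raw accuracy, is the lever that matters.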
The Missing Structure
Making the model smarter doesn't solve it. Neither do guardrails. Agents need explicit rules to operate within—and those rules were never written down because humans didn't need them.
The Core Insight
Agents are translators. They convert fuzzy human input into structured system operations. But translation requires something to translate into. Without structure, there's nothing to translate to.
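What "something to translate into" looks like in practice is a closed vocabulary of typed operations. A minimal sketch, with hypothetical names (`Operation`, `ALLOWED_ACTIONS` are illustrative, not part of any real API):

```python
from dataclasses import dataclass

# Hypothetical target structure: fuzzy input must be translated into
# one of a fixed set of operations with typed fields.
ALLOWED_ACTIONS = {"refund", "cancel", "escalate"}

@dataclass(frozen=True)
class Operation:
    action: str
    order_id: str

    def __post_init__(self):
        # If the action isn't in the vocabulary, translation fails loudly
        # instead of producing a plausible-looking invalid operation.
        if self.action not in ALLOWED_ACTIONS:
            raise ValueError(f"unknown action: {self.action!r}")

op = Operation(action="refund", order_id="A-1001")        # valid translation
# Operation(action="delete_db", order_id="A-1001")        # raises ValueError
```

The point of the sketch is the constructor check: with a defined target, an invalid translation is an exception at the boundary, not a silent wrong action downstream.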
The key to long-running agents isn't making fewer errors; it's detecting and correcting them. You can't push per-decision error rates low enough to survive long runs. You need errors caught before they compound.
When the rules are explicit, errors can be caught and fixed rather than compounded.
What structure gives you
- Errors get caught. When the rules are explicit, violations fail immediately instead of passing through silently.
- State is visible. You know what state things are in. You can see what happened and why.
- Recovery is possible. Operations can be undone. You can go back to a known good point. Errors get corrected, not compounded.
The Structure Problem
The complete argument: why agent deployments fail, what structure actually means, and how to build it incrementally. Read this first for the full framework, or explore the essays below for depth on individual topics.
Read the Core Essay
A Reading Experience Built for Learning
Highlight & Annotate
Mark passages that resonate. Add your own notes. Build a personal layer of insight on top of the text.
Share & Export
Share specific passages with colleagues. Export your highlights and notes as markdown for your own reference.
Private by Design
No account needed
Your data stays on your device. No accounts, no tracking, no cloud sync. Your annotations are yours alone.
Explorations
Eleven essays exploring the framework from different angles. Start with the core essay above, then follow your interests—each exploration stands alone while building on the central argument.
Who This Is For
Engineers
Understand why some agent systems work reliably while others fail. Learn patterns that transfer across domains.
Product Managers
Know what infrastructure needs to exist before agents can succeed. Avoid the demo-to-production trap.
Executives
See where the real competitive advantage lies. Understand why some investments pay off and others don't.
Investors
Identify companies building durable moats. Spot the difference between demos and defensible products.
Start with the Framework
The core essay covers the complete argument in about 11 minutes. Everything else builds on those foundations.
Read the Core Essay