Computational Foundations for Reliable Agentic Systems
Everything in this framework is about one thing: state control.
For agents to operate reliably, they need:
- State visibility — A clear, accurate view of what state the system is in
- Defined operations — Specific actions with clear domain meaning that apply in clear situations
- State constraints — Tight controls on what states are reachable, even when mistakes happen
- State recovery — The ability to return to a known-good state when something goes wrong
- State history — The agent's own actions become observable state
Every concept in reliable systems—ACID properties, error correction, type systems, checkpointing, logging—is a manifestation of state control under a different guise.
The Central Problem
Agents are probabilistic. They make mistakes. In a long-running task, errors compound: if each decision is 99% accurate and errors are independent, the chance of getting all 100 decisions right is only about 37% (0.99^100 ≈ 0.366).
The question isn't how to prevent all errors. It's: when an error occurs, what happens to state?
- Can you detect the error? (You need to know the current state)
- Can you reason about what to do? (You need operations with clear meaning)
- Can you contain the damage? (Invalid states should be unreachable)
- Can you recover? (You need a path back to valid state)
- Can you understand what happened? (Actions must be recorded)
I. State Visibility
The agent must know what state the system is in.
Why This Matters
Without visibility, agents operate blind. They can't verify their actions succeeded. They can't detect when something went wrong. They make decisions based on stale or incorrect assumptions.
What Visibility Requires
Discrete, enumerable states. If state is continuous or fuzzy, you can't validate it. An order status of {pending, confirmed, shipped, delivered, cancelled} can be checked. An order status of "somewhere in the fulfillment process" cannot.
Observable state. The system provides operations to read current state—not just to modify it. The agent can query: "What state is this entity in? What are its current values?"
Verification after action. After performing an operation, the agent can confirm the expected state change occurred. Not "the operation returned success" but "the state is now what I expected."
Feedback on failure. When operations fail, the agent learns why—not just "error" but "INVALID_STATE: order already shipped" or "PRECONDITION_FAILED: insufficient balance."
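The four requirements above can be sketched together in a few lines. This is a minimal illustration, not a reference implementation: the in-memory `ORDERS` store, the `OperationError` class, and the function names are assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class OrderStatus(Enum):
    """Discrete, enumerable states: anything else is detectably invalid."""
    PENDING = "pending"
    CONFIRMED = "confirmed"
    SHIPPED = "shipped"
    DELIVERED = "delivered"
    CANCELLED = "cancelled"


class OperationError(Exception):
    """Typed failure: a machine-readable code plus a human-readable reason."""

    def __init__(self, code: str, reason: str):
        super().__init__(f"{code}: {reason}")
        self.code = code
        self.reason = reason


@dataclass
class Order:
    order_id: str
    status: OrderStatus = OrderStatus.PENDING


ORDERS: dict[str, Order] = {}  # hypothetical in-memory store


def get_order_status(order_id: str) -> OrderStatus:
    """Read operation: observe state without modifying it."""
    return ORDERS[order_id].status


def cancel_order(order_id: str) -> None:
    order = ORDERS[order_id]
    if order.status in (OrderStatus.SHIPPED, OrderStatus.DELIVERED):
        raise OperationError("INVALID_STATE", f"order already {order.status.value}")
    order.status = OrderStatus.CANCELLED


def cancel_and_verify(order_id: str) -> bool:
    """Check the resulting state itself, not just the absence of an exception."""
    cancel_order(order_id)
    return get_order_status(order_id) is OrderStatus.CANCELLED
```

An agent calling `cancel_order` on a shipped order gets `INVALID_STATE: order already shipped` back, which is enough information to choose a different operation.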
The Shannon Connection
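Claude Shannon's work on communication showed that messages drawn from a discrete set can be sent reliably over a noisy channel. If you're sending one of four messages {A, B, C, D} and noise corrupts A slightly, the receiver can still identify that A was intended, as long as the corrupted signal stays closer to A than to any other message. The discrete structure makes corruption detectable and correctable.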
The same applies to state. Discrete states create "basins of attraction"—if the system drifts slightly, you can detect it's no longer in a valid state. Fuzzy states hide errors until something breaks.
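A toy illustration of the idea, assuming four symbols mapped to evenly spaced signal levels (the `LEVELS` table and the `max_drift` threshold are invented for this sketch, and it is not a statement of Shannon's coding theorem):

```python
LEVELS = {"A": 0.0, "B": 1.0, "C": 2.0, "D": 3.0}  # hypothetical signal levels


def decode(received: float, max_drift: float = 0.4):
    """Map a noisy reading to the nearest symbol, or None if it drifted too far."""
    symbol, level = min(LEVELS.items(), key=lambda item: abs(item[1] - received))
    if abs(level - received) > max_drift:
        return None  # corruption is detectable: the reading matches no symbol
    return symbol


print(decode(0.2))  # small drift around A still decodes as A
print(decode(1.5))  # halfway between B and C: flagged as invalid
```

The same shape applies to system state: a value that falls outside every valid "basin" is flagged instead of being silently accepted.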
II. Defined Operations
The agent needs specific actions with clear domain meaning, not generic state manipulation.
Why This Matters
Imagine giving an agent two different interfaces to the same order system:
Generic interface:
update_field(entity, field_name, new_value)
Domain-specific interface:
submit_order(order)
confirm_payment(order, payment_id)
ship_order(order, carrier, tracking_number)
cancel_order(order, reason)
Both can reach the same states. But with the generic interface, the agent must figure out: "To ship an order, I need to set status to 'shipped' and also set tracking_number and carrier and..." It's reasoning about implementation details, not domain concepts.
With domain-specific operations, the agent reasons in the language of the domain: "The order is confirmed and paid. The next step is to ship it. I'll call ship_order." The operation encapsulates what "shipping" means.
What Defined Operations Provide
Domain vocabulary. Operations are verbs in the domain language: approve, reject, escalate, refund, assign. The agent thinks in terms the business understands.
Clear applicability. Each operation has explicit preconditions that define when it applies:
ship_order(order)
applies when: order.status == CONFIRMED
              order.payment.status == CAPTURED
              order.shipping_address exists
The agent doesn't guess whether shipping makes sense; the preconditions tell it.
Predictable effects. Each operation has defined effects—what changes when it executes:
ship_order(order, carrier, tracking)
effects: order.status = SHIPPED
         order.carrier = carrier
         order.tracking_number = tracking
         order.shipped_at = now()
         inventory.decrement(order.items)
The agent knows exactly what will happen. No hidden side effects, no surprises.
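One way to make applicability and effects explicit in code is sketched below. The `Order` fields, the `unmet_preconditions` helper, and the dictionary-based `inventory` are assumptions for illustration; only the preconditions and effects themselves come from the text above.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Order:
    status: str = "CONFIRMED"
    payment_captured: bool = True
    shipping_address: Optional[str] = None
    carrier: Optional[str] = None
    tracking_number: Optional[str] = None
    shipped_at: Optional[datetime] = None
    items: dict = field(default_factory=dict)  # sku -> quantity


def unmet_preconditions(order: Order) -> list[str]:
    """Return the ship_order preconditions that do not hold (empty = applicable)."""
    checks = {
        "status is CONFIRMED": order.status == "CONFIRMED",
        "payment captured": order.payment_captured,
        "shipping address exists": order.shipping_address is not None,
    }
    return [name for name, ok in checks.items() if not ok]


def ship_order(order: Order, carrier: str, tracking: str, inventory: dict) -> None:
    failed = unmet_preconditions(order)
    if failed:
        # Refuse before touching any state, and say which preconditions failed.
        raise ValueError("PRECONDITION_FAILED: " + "; ".join(failed))
    # Effects: everything that "shipping" means, applied in one place.
    order.status = "SHIPPED"
    order.carrier = carrier
    order.tracking_number = tracking
    order.shipped_at = datetime.now(timezone.utc)
    for sku, qty in order.items.items():
        inventory[sku] -= qty
```

Because `unmet_preconditions` is callable on its own, an agent can ask whether `ship_order` applies before invoking it.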
Composed meaning. Complex workflows become sequences of meaningful steps:
receive_return_request(order) → inspect_return(order) → approve_refund(order)
Each step is a domain concept. The agent isn't manipulating fields; it's executing a business process.
Operations vs. Raw State Access
| Raw State Access | Defined Operations |
|---|---|
| Agent reasons about fields and values | Agent reasons about domain actions |
| Must know implementation details | Implementation hidden behind interface |
| Easy to create inconsistent state | Operations maintain consistency |
| Hard to audit ("what did field change X mean?") | Easy to audit ("agent called refund_order") |
| Agent may flail trying combinations | Agent selects from meaningful options |
The Right Granularity
Operations should match domain concepts:
Too fine-grained: set_status(order, 'shipped') — Agent must know all the other things that happen when an order ships.
Too coarse-grained: handle_order(order) — What does "handle" mean? The operation is too vague to reason about.
Right level: ship_order(order, carrier, tracking) — A single domain concept with clear meaning, clear applicability, clear effects.
III. State Constraints
The agent should not be able to put the system into an invalid state, even when it makes mistakes.
Why This Matters
Agents will attempt invalid operations. They'll try to ship cancelled orders, approve their own expenses, create users without email addresses. The question is: what happens when they try?
If invalid operations execute, you get state corruption. If they're rejected at the boundary, the system stays consistent and the agent gets feedback.
What Constraints Look Like
Preconditions on operations. Rules that must hold before an operation can execute.
ship_order(order)
requires: order.status == CONFIRMED
requires: order.shipping_address is not null
requires: order.payment.status == CAPTURED
The operation itself refuses to execute in inappropriate situations.
Invariants. Properties that must always hold, checked after every state change.
invariant: order.refund_total <= order.payment_total
invariant: account.balance >= 0
invariant: shipped_order.tracking_number is not null
Valid transitions. Not every state change is allowed, even through defined operations.
CONFIRMED → SHIPPED (valid via ship_order)
CANCELLED → SHIPPED (invalid, no operation allows this)
DELIVERED → PENDING (invalid, no operation allows this)
Type constraints. Values constrained to valid sets.
priority: LOW | MEDIUM | HIGH | CRITICAL (not any string)
amount: PositiveDecimal (not any number)
email: EmailAddress (not any string)
The Validation Layer
Like an OS kernel protecting system resources from user programs, a validation layer sits between the agent and system state:
Agent (selects and invokes operations)
↓
Validation Layer (checks preconditions, enforces constraints)
↓
System State (protected)
The agent can be sophisticated but unreliable. The validation layer is simple and auditable. Trust is placed in a small, verifiable component that can't be bypassed.
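A validation layer of this kind can be quite small. The sketch below combines a type check, a transition table, and two of the invariants listed earlier. The `apply` function, the `Rejected` exception, and every transition other than CONFIRMED → SHIPPED are assumptions made for the example.

```python
from dataclasses import dataclass, replace
from decimal import Decimal
from enum import Enum
from typing import Optional


class Status(Enum):
    PENDING = "PENDING"
    CONFIRMED = "CONFIRMED"
    SHIPPED = "SHIPPED"
    DELIVERED = "DELIVERED"
    CANCELLED = "CANCELLED"


# Allowed transitions (assumed); anything absent is rejected,
# for example CANCELLED -> SHIPPED or DELIVERED -> PENDING.
TRANSITIONS = {
    (Status.PENDING, Status.CONFIRMED),
    (Status.PENDING, Status.CANCELLED),
    (Status.CONFIRMED, Status.SHIPPED),
    (Status.CONFIRMED, Status.CANCELLED),
    (Status.SHIPPED, Status.DELIVERED),
}


@dataclass(frozen=True)
class Order:
    status: Status
    payment_total: Decimal
    refund_total: Decimal = Decimal("0")
    tracking_number: Optional[str] = None


class Rejected(Exception):
    """Raised when a proposed change would violate a constraint."""


def check_invariants(order: Order) -> None:
    if order.refund_total > order.payment_total:
        raise Rejected("INVARIANT: refund_total <= payment_total")
    if order.status is Status.SHIPPED and order.tracking_number is None:
        raise Rejected("INVARIANT: shipped order needs tracking_number")


def apply(order: Order, **changes) -> Order:
    """The only write path: validate a proposed state, then return it."""
    proposed = replace(order, **changes)  # a copy; the original is untouched
    if not isinstance(proposed.status, Status):
        raise Rejected("TYPE: status must be a Status value")
    if proposed.status is not order.status:
        if (order.status, proposed.status) not in TRANSITIONS:
            raise Rejected(
                f"INVALID_TRANSITION: {order.status.value} -> {proposed.status.value}"
            )
    check_invariants(proposed)
    return proposed
```

In a fuller design, domain operations such as `ship_order` would be the callers of `apply`, so the agent never reaches this layer through a generic field update.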
ACID Properties as State Constraints
The database world discovered the same principles:
Atomicity — Operations complete fully or not at all. No partial state changes that leave the system inconsistent.
Consistency — Every operation maintains invariants. The system is never observed in an invalid state.
Isolation — Concurrent operations don't interfere. Two agents can't both read balance=$100 and both withdraw $75.
Durability — Committed state persists through failures. You don't lose state changes to crashes.
These aren't database-specific—they're fundamental properties of reliable state management.
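Atomicity and consistency can be seen directly with Python's standard `sqlite3` module. In this sketch the `account` table and the `transfer` function are made up for the example; the `CHECK` constraint plays the role of the invariant `account.balance >= 0`, and the connection's context manager commits on success and rolls back on an exception.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "id TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"  # invariant enforced by the store
)
conn.execute("INSERT INTO account VALUES ('a', 100), ('b', 0)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move funds atomically: both updates commit, or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False  # the CHECK constraint rejected the debit; the credit was undone


def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))
```

The credit is deliberately applied before the debit so that a failed debit demonstrates the rollback: after a rejected transfer, neither account has changed.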
IV. State Recovery
When something goes wrong, the system can return to a known-good state.
Why This Matters
Errors will occur. The question is whether they're recoverable or catastrophic. A system with good recovery properties treats most errors as temporary setbacks. A system without them treats every error as potential corruption.
What Recovery Requires
Checkpointing. Save state at known-good points. When something fails at step 47, you restart from the last checkpoint, not from step 1.
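A minimal checkpointing loop might look like the following. It assumes the task state is an in-memory object that can be deep-copied and that each step is a function mutating that state; `run_with_checkpoints` and `checkpoint_every` are names invented for this sketch.

```python
import copy


def run_with_checkpoints(steps, state, checkpoint_every=10):
    """Apply steps in order, saving a deep copy of state every few steps.

    Returns (state, completed). On failure, state is the last checkpoint and
    completed is the index of the step to resume from.
    """
    checkpoint, checkpoint_index = copy.deepcopy(state), 0
    for i, step in enumerate(steps):
        try:
            step(state)
        except Exception:
            # Discard everything since the checkpoint, including partial changes.
            return checkpoint, checkpoint_index
        if (i + 1) % checkpoint_every == 0:
            checkpoint, checkpoint_index = copy.deepcopy(state), i + 1
    return state, len(steps)
```

This only restores state the process owns. Side effects on external systems still need inverse or compensating operations.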
Reversible operations. Operations that can be undone. approve_expense has a corresponding unapprove_expense. soft_delete_record has restore_record. The operation vocabulary includes ways to back out.
Idempotent operations. Operations safe to retry. If you're not sure whether ship_order succeeded, calling it again with the same parameters returns the same result without double-shipping.
Compensating transactions. For operations that can't be truly undone, corresponding operations compensate. Can't un-send an email, but send_correction_email exists. Can't un-charge a card, but issue_refund exists.
The UNKNOWN escape hatch. When the agent can't determine which operation applies, it can say "I don't know" and route to human review. This prevents the agent from randomly trying operations hoping one works.
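Two of these mechanisms are easy to sketch. The idempotency key below is one common way to make a retry safe; it is an assumption of this example, not something the text prescribes, as are the `SHIPMENTS` store, the `HUMAN_REVIEW` queue, and the rule that exactly one applicable operation is required before acting.

```python
SHIPMENTS = {}     # idempotency_key -> shipment record (hypothetical store)
HUMAN_REVIEW = []  # cases the agent declined to decide


def ship_order(order_id, carrier, tracking, idempotency_key):
    """Retrying with the same key returns the first result instead of shipping twice."""
    if idempotency_key in SHIPMENTS:
        return SHIPMENTS[idempotency_key]
    record = {"order_id": order_id, "carrier": carrier, "tracking": tracking}
    SHIPMENTS[idempotency_key] = record
    return record


def choose_operation(applicable):
    """Return the single applicable operation name, or UNKNOWN if it is ambiguous."""
    if len(applicable) == 1:
        return applicable[0]
    return "UNKNOWN"


def dispatch(order_id, applicable):
    op = choose_operation(applicable)
    if op == "UNKNOWN":
        HUMAN_REVIEW.append(order_id)  # escalate rather than guess
    return op
```

If the agent times out waiting for `ship_order`, it can call it again with the same key and get the original record back, whether or not the first call went through.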
The DNA Lesson
Human DNA replication copies roughly 3 billion base pairs per copy of the genome. The raw error rate of nucleotide selection is on the order of 1 in 10,000. At that rate, you'd get about 300,000 errors per copy, which would be catastrophic.
The actual error rate is about 1 in a billion. How?
- Proofreading — The polymerase checks each nucleotide immediately after adding it, backing up to fix errors
- Mismatch repair — Separate enzymes scan for errors that slipped through
- Additional repair mechanisms — Various systems catch accumulated damage
This is error correction through state recovery: detect the problem, return to good state, proceed. The same pattern applies to agent systems.
V. State History
The agent's actions become observable state.
Why This Matters
If you can't see what the agent did, you can't debug failures, measure accuracy, or improve the system. The agent's decisions—which operations it invoked and why—are themselves state that must be visible.
What State History Provides
Audit trail. Every operation invocation is recorded: which operation, what inputs, what the outcome was. When something goes wrong, you can reconstruct what happened in domain terms: "The agent called ship_order, then cancel_order, then ship_order again."
Accuracy measurement. If decisions are logged with their context, you can sample them, have humans evaluate correctness, and measure: "Was ship_order the right operation to invoke here?"
Pattern detection. Logged operations reveal patterns: the agent keeps trying operations that fail their preconditions; the agent cycles between two operations without progress; certain operation sequences correlate with failures.
Recovery information. To undo or compensate, you need to know what operations occurred. "We need to reverse the last three operations" requires knowing what they were.
What to Record
For every operation invocation:
- Operation name — Which domain operation was invoked
- Inputs — What parameters were provided
- Precondition state — Did preconditions pass? Which ones failed?
- Outcome — Success, failure, which error
- Resulting state — What state did the system end up in?
- Agent reasoning — Why did the agent select this operation? (if available)
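The fields above map naturally onto a record type. The sketch below is one possible shape: the `OperationRecord` name, the JSON-lines encoding, and the added timestamp are assumptions of the example.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class OperationRecord:
    operation: str               # which domain operation was invoked
    inputs: dict                 # parameters provided
    failed_preconditions: list   # empty if all preconditions passed
    outcome: str                 # "SUCCESS" or an error code
    resulting_state: dict        # state after the attempt
    reasoning: Optional[str] = None  # why the agent chose this, if available
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


LOG: list[OperationRecord] = []  # hypothetical in-memory log


def record(entry: OperationRecord) -> str:
    """Append to the log and return one JSON line per invocation."""
    LOG.append(entry)
    return json.dumps(asdict(entry))
```

Rejected invocations are recorded too, with the failed preconditions listed, since those are the entries that reveal where the agent is confused.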
Operations as First-Class Events
The operation log isn't just debugging infrastructure—it's part of the system's state. "What operations has this agent invoked?" is as legitimate a query as "What is this order's status?"
This enables:
- Querying for recent operations by type
- Detecting patterns (agent invoking same operation repeatedly, operation sequences that indicate confusion)
- Rate limiting and budgets (no more than N refund operations per hour)
- Compliance and audit requirements (who authorized this state change?)
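Once the log is queryable, budgets and pattern checks become ordinary functions over it. The sketch below assumes each log entry is a dictionary with `operation`, `outcome`, and `at` (a time in seconds); those field names and the thresholds are illustrative only.

```python
from collections import Counter


def recent(log, operation, now, window_seconds):
    """Entries for one operation type within the trailing time window."""
    return [
        e for e in log
        if e["operation"] == operation and now - e["at"] <= window_seconds
    ]


def within_budget(log, operation, now, limit, window_seconds=3600):
    """True if another invocation would stay within N operations per window."""
    return len(recent(log, operation, now, window_seconds)) < limit


def repeated_failures(log, threshold=3):
    """(operation, error) pairs that failed at least `threshold` times."""
    counts = Counter(
        (e["operation"], e["outcome"]) for e in log if e["outcome"] != "SUCCESS"
    )
    return [key for key, n in counts.items() if n >= threshold]
```

A validation layer could call `within_budget` as one more precondition, and a supervisor could treat any result from `repeated_failures` as a signal to escalate.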
Synthesis: The Five Properties
A system is computationally accessible to agents when it provides:
| Property | What It Means | What Enables It |
|---|---|---|
| Visibility | Agent knows current state | Discrete states, read operations, typed feedback |
| Operations | Agent has meaningful actions | Domain verbs with clear preconditions and effects |
| Constraints | Invalid states unreachable | Preconditions, invariants, validation layer, types |
| Recovery | Can return to good state | Checkpoints, reversibility, idempotency, UNKNOWN |
| History | Actions are observable | Logging, audit trails, operation records |
These properties are mutually reinforcing:
- Visibility tells the agent which operations' preconditions are met
- Operations make state changes predictable, aiding visibility
- Constraints keep the system inside its enumerated valid states, so what the agent observes is always interpretable
- History records which operations occurred, enabling recovery
- Recovery is expressed through compensating operations
Computational Primitives
The operations a system must provide for agents to maintain state control:
For Visibility
- Read state — Query current values and status
- Verify effects — Confirm an operation had the expected result
- Typed errors — Know why an operation failed
For Operations
- Domain-specific verbs — Actions with business meaning (not generic CRUD)
- Explicit preconditions — When each operation applies
- Defined effects — What each operation changes
- Composable sequences — Operations that chain into workflows
For Constraints
- Precondition enforcement — Operations reject invalid invocations
- Invariant checking — System-wide rules verified after each change
- Atomic execution — Operations complete fully or not at all
- Type validation — Invalid values rejected at the boundary
For Recovery
- Checkpoint/restore — Save and return to known-good state
- Inverse operations — Operations that undo other operations
- Idempotent operations — Safe to retry on uncertainty
- Escalation — Route to human when no operation clearly applies
For History
- Operation logging — Record every invocation with context
- Outcome capture — Store success/failure and resulting state
- Query interface — Retrieve operation history
The Error Compounding Problem
Why state control matters more for agents than for humans:
With 99% per-decision accuracy:
- 10 decisions: 90% overall success
- 100 decisions: 37% overall success
- 500 decisions: 0.7% overall success
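These figures follow from multiplying the per-decision accuracy by itself once per decision, under the simplifying assumption that errors are independent and that any single error fails the task. A short check (the function name is made up for the example):

```python
def task_success_rate(per_decision_accuracy: float, decisions: int) -> float:
    """Probability that every one of `decisions` independent decisions is correct."""
    return per_decision_accuracy ** decisions


for n in (10, 100, 500):
    print(f"{n:>3} decisions: {task_success_rate(0.99, n):.1%}")
# prints:
#  10 decisions: 90.4%
# 100 decisions: 36.6%
# 500 decisions: 0.7%
```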
Errors compound geometrically. The only escape is error correction—detecting and fixing errors before they propagate.
Error correction requires state control:
- Visibility to detect the error
- Operations to know how to fix it
- Constraints to limit damage
- Recovery to execute the fix
- History to understand what happened
Without state control, agents hit a wall: past a certain number of decisions, a task is almost certain to fail. With state control, errors can be caught and corrected as they occur, so the success rate no longer has to decay with task length.
Summary
Reliable agent operation is state control:
Know the state. Discrete, observable, verifiable.
Provide meaningful operations. Domain-specific actions with clear preconditions, clear effects, and clear applicability. The agent reasons about what to do, not how to manipulate fields.
Constrain the state. Invalid states should be unreachable through the operations agents have access to.
Enable recovery. When something goes wrong, return to known-good state through inverse or compensating operations.
Record everything. Operations invoked are state too. Make them visible.
Every other concept—ACID, type systems, validation layers, checkpointing, error correction codes, feedback loops—is one of these five properties applied to a specific context.
Build systems that provide these properties, and agents can operate reliably. Skip any of them, and you're hoping the agent never makes a mistake—a hope that will be disappointed.