Computational Foundations for Reliable Agentic Systems

Everything in this framework is about one thing: state control.

For agents to operate reliably, they need:

State visibility — A clear, accurate view of what state the system is in
Defined operations — Specific actions with clear domain meaning that apply in clear situations
State constraints — Tight controls on what states are reachable, even when mistakes happen
State recovery — The ability to return to a known-good state when something goes wrong
State history — The agent's own actions become observable state

Every concept in reliable systems—ACID properties, error correction, type systems, checkpointing, logging—is a manifestation of state control under a different guise.

The Central Problem

Agents are probabilistic. They make mistakes. In a long-running task, errors compound: if each decision is independently 99% accurate, the chance of getting all 100 decisions in a task right is only about 37% (0.99^100 ≈ 0.366).

The question isn't how to prevent all errors. It's: when an error occurs, what happens to state?

Can you detect the error? (You need to know the current state)
Can you reason about what to do? (You need operations with clear meaning)
Can you contain the damage? (Invalid states should be unreachable)
Can you recover? (You need a path back to valid state)
Can you understand what happened? (Actions must be recorded)

I. State Visibility

The agent must know what state the system is in.

Why This Matters

Without visibility, agents operate blind. They can't verify their actions succeeded. They can't detect when something went wrong. They make decisions based on stale or incorrect assumptions.

What Visibility Requires

Discrete, enumerable states. If state is continuous or fuzzy, you can't validate it. An order status of {pending, confirmed, shipped, delivered, cancelled} can be checked. An order status of "somewhere in the fulfillment process" cannot.

Observable state. The system provides operations to read current state—not just to modify it. The agent can query: "What state is this entity in? What are its current values?"

Verification after action. After performing an operation, the agent can confirm the expected state change occurred. Not "the operation returned success" but "the state is now what I expected."

Feedback on failure. When operations fail, the agent learns why—not just "error" but "INVALID_STATE: order already shipped" or "PRECONDITION_FAILED: insufficient balance."
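The four requirements above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: OrderStore, OperationError, and cancel_and_verify are hypothetical names, and the in-memory dictionary stands in for whatever system actually holds the orders.

```python
from dataclasses import dataclass
from enum import Enum


class OrderStatus(Enum):
    """Discrete, enumerable states: any other value is rejected by the enum."""
    PENDING = "pending"
    CONFIRMED = "confirmed"
    SHIPPED = "shipped"
    DELIVERED = "delivered"
    CANCELLED = "cancelled"


class OperationError(Exception):
    """Typed failure: a machine-readable code plus a human-readable reason."""

    def __init__(self, code: str, reason: str):
        super().__init__(f"{code}: {reason}")
        self.code = code
        self.reason = reason


@dataclass
class Order:
    order_id: str
    status: OrderStatus


class OrderStore:
    """Hypothetical in-memory store standing in for the real system."""

    def __init__(self):
        self._orders = {}

    def add(self, order: Order) -> None:
        self._orders[order.order_id] = order

    def get_status(self, order_id: str) -> OrderStatus:
        # Observable state: a read operation, separate from any modification.
        return self._orders[order_id].status

    def cancel_order(self, order_id: str) -> None:
        order = self._orders[order_id]
        if order.status in (OrderStatus.SHIPPED, OrderStatus.DELIVERED):
            # Feedback on failure: say why, not just "error".
            raise OperationError("INVALID_STATE", f"order already {order.status.value}")
        order.status = OrderStatus.CANCELLED


def cancel_and_verify(store: OrderStore, order_id: str) -> bool:
    """Act, then confirm by reading state instead of trusting the return value."""
    store.cancel_order(order_id)
    return store.get_status(order_id) is OrderStatus.CANCELLED
```

Cancelling a shipped order here raises an OperationError whose message reads "INVALID_STATE: order already shipped", and the order's status is left unchanged.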

The Shannon Connection

Claude Shannon's information theory showed that messages drawn from a discrete set can be communicated reliably over a noisy channel. If you're sending one of four messages {A, B, C, D} and noise corrupts A slightly, the receiver can still identify that A was intended. The discrete structure makes corruption detectable and, within limits, correctable.

The same applies to state. Discrete states create "basins of attraction"—if the system drifts slightly, you can detect it's no longer in a valid state. Fuzzy states hide errors until something breaks.
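A toy illustration of the idea, not Shannon's actual construction: each of the four messages is assigned a numeric level, and a noisy reading is snapped to the nearest one. The level values and the function names decode and drift are assumptions made for this sketch.

```python
# Hypothetical signal levels for the four messages; the spacing of 1.0 is an assumption.
LEVELS = {"A": 0.0, "B": 1.0, "C": 2.0, "D": 3.0}


def decode(received: float) -> str:
    """Return the message whose level is closest to the received value."""
    return min(LEVELS, key=lambda symbol: abs(LEVELS[symbol] - received))


def drift(received: float) -> float:
    """Distance from the nearest valid level; a large value signals corruption."""
    return abs(LEVELS[decode(received)] - received)
```

A reading of 0.2 still decodes to "A", and its drift of 0.2 shows how far the signal has moved from a valid level. With a continuous signal there would be no valid levels to measure against.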

II. Defined Operations

The agent needs specific actions with clear domain meaning, not generic state manipulation.

Why This Matters

Imagine giving an agent two different interfaces to the same order system:

Generic interface:

update_field(entity, field_name, new_value)

Domain-specific interface:

submit_order(order)
confirm_payment(order, payment_id)
ship_order(order, carrier, tracking_number)
cancel_order(order, reason)

Both can reach the same states. But with the generic interface, the agent must figure out: "To ship an order, I need to set status to 'shipped' and also set tracking_number and carrier and..." It's reasoning about implementation details, not domain concepts.

With domain-specific operations, the agent reasons in the language of the domain: "The order is confirmed and paid. The next step is to ship it. I'll call ship_order." The operation encapsulates what "shipping" means.

What Defined Operations Provide

Domain vocabulary. Operations are verbs in the domain language: approve, reject, escalate, refund, assign. The agent thinks in terms the business understands.

Clear applicability. Each operation has explicit preconditions that define when it applies:

ship_order(order, carrier, tracking)
  applies when: order.status == CONFIRMED
                order.payment.status == CAPTURED
                order.shipping_address exists

The agent doesn't guess whether shipping makes sense—the preconditions tell it.

Predictable effects. Each operation has defined effects—what changes when it executes:

ship_order(order, carrier, tracking)
  effects: order.status = SHIPPED
           order.carrier = carrier
           order.tracking_number = tracking
           order.shipped_at = now()
           inventory.decrement(order.items)

The agent knows exactly what will happen. No hidden side effects, no surprises.
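One way the applicability rules and effects of ship_order might look in code. This is a sketch under assumptions: the Order fields, the inventory dictionary keyed by SKU, and the helper unmet_preconditions are invented for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class OrderStatus(Enum):
    CONFIRMED = "confirmed"
    SHIPPED = "shipped"


@dataclass
class Order:
    status: OrderStatus
    payment_captured: bool
    shipping_address: Optional[str]
    items: dict  # hypothetical shape: SKU -> quantity
    carrier: Optional[str] = None
    tracking_number: Optional[str] = None
    shipped_at: Optional[datetime] = None


def unmet_preconditions(order: Order) -> list:
    """List every precondition that does not hold, so the agent learns why."""
    unmet = []
    if order.status is not OrderStatus.CONFIRMED:
        unmet.append("order is CONFIRMED")
    if not order.payment_captured:
        unmet.append("payment is captured")
    if not order.shipping_address:
        unmet.append("shipping address exists")
    return unmet


def ship_order(order: Order, inventory: dict, carrier: str, tracking: str) -> None:
    """Apply the documented effects only when all preconditions hold."""
    unmet = unmet_preconditions(order)
    if unmet:
        raise ValueError("PRECONDITION_FAILED: " + "; ".join(unmet))
    order.status = OrderStatus.SHIPPED
    order.carrier = carrier
    order.tracking_number = tracking
    order.shipped_at = datetime.now(timezone.utc)
    for sku, quantity in order.items.items():
        inventory[sku] -= quantity
```

Because unmet_preconditions is a separate function, the agent can call it before deciding whether shipping applies, rather than discovering the answer by triggering a failure.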

Composed meaning. Complex workflows become sequences of meaningful steps:

receive_return_request(order)  →  inspect_return(order)  →  approve_refund(order)

Each step is a domain concept. The agent isn't manipulating fields; it's executing a business process.

Operations vs. Raw State Access

Raw State Access	Defined Operations
Agent reasons about fields and values	Agent reasons about domain actions
Must know implementation details	Implementation hidden behind interface
Easy to create inconsistent state	Operations maintain consistency
Hard to audit ("what did field change X mean?")	Easy to audit ("agent called refund_order")
Agent may flail trying combinations	Agent selects from meaningful options

The Right Granularity

Operations should match domain concepts:

Too fine-grained: set_status(order, 'shipped') — Agent must know all the other things that happen when an order ships.

Too coarse-grained: handle_order(order) — What does "handle" mean? The operation is too vague to reason about.

Right level: ship_order(order, carrier, tracking) — A single domain concept with clear meaning, clear applicability, clear effects.

III. State Constraints

The agent should not be able to put the system into an invalid state, even when it makes mistakes.

Why This Matters

Agents will attempt invalid operations. They'll try to ship cancelled orders, approve their own expenses, create users without email addresses. The question is: what happens when they try?

If invalid operations execute, you get state corruption. If they're rejected at the boundary, the system stays consistent and the agent gets feedback.

What Constraints Look Like

Preconditions on operations. Rules that must hold before an operation can execute.

ship_order(order, carrier, tracking)
  requires: order.status == CONFIRMED
  requires: order.shipping_address is not null
  requires: order.payment.status == CAPTURED

The operation itself refuses to execute in inappropriate situations.

Invariants. Properties that must always hold, checked after every state change.

invariant: order.refund_total <= order.payment_total
invariant: account.balance >= 0
invariant: shipped_order.tracking_number is not null
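A minimal sketch of checking an invariant after every state change, using the refund rule above. The change is applied to a copy, and the copy is returned only if every invariant still holds. The names apply_checked, refund, and InvariantViolation are hypothetical.

```python
import copy
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class Order:
    payment_total: Decimal
    refund_total: Decimal = Decimal("0")


# Each invariant is a label plus a predicate over the whole order.
INVARIANTS = {
    "order.refund_total <= order.payment_total":
        lambda o: o.refund_total <= o.payment_total,
}


class InvariantViolation(Exception):
    def __init__(self, violated):
        super().__init__("INVARIANT_VIOLATED: " + "; ".join(violated))
        self.violated = violated


def apply_checked(order: Order, change) -> Order:
    """Apply `change` to a copy; return it only if all invariants hold."""
    candidate = copy.deepcopy(order)
    change(candidate)
    violated = [label for label, holds in INVARIANTS.items() if not holds(candidate)]
    if violated:
        raise InvariantViolation(violated)
    return candidate


def refund(amount: Decimal):
    """Build a change that adds `amount` to the order's refund total."""
    def change(order: Order) -> None:
        order.refund_total += amount
    return change
```

Working on a copy means a rejected change never touches the original order, so no separate rollback step is needed.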

Valid transitions. Not every state change is allowed, even through defined operations.

CONFIRMED → SHIPPED (valid via ship_order)
CANCELLED → SHIPPED (invalid, no operation allows this)
DELIVERED → PENDING (invalid, no operation allows this)
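A transition table is one simple way to encode this. In the sketch below only CONFIRMED → SHIPPED via ship_order comes from the text; the other entries are assumptions added to make the table less trivial, and operation_for is a hypothetical helper.

```python
from enum import Enum, auto
from typing import Optional


class OrderStatus(Enum):
    PENDING = auto()
    CONFIRMED = auto()
    SHIPPED = auto()
    DELIVERED = auto()
    CANCELLED = auto()


# (from, to) -> the operation that performs the transition.
# Anything absent from this table is invalid by construction.
ALLOWED_TRANSITIONS = {
    (OrderStatus.CONFIRMED, OrderStatus.SHIPPED): "ship_order",
    # Assumed entries, for illustration only:
    (OrderStatus.PENDING, OrderStatus.CONFIRMED): "confirm_payment",
    (OrderStatus.PENDING, OrderStatus.CANCELLED): "cancel_order",
    (OrderStatus.CONFIRMED, OrderStatus.CANCELLED): "cancel_order",
}


def operation_for(current: OrderStatus, target: OrderStatus) -> Optional[str]:
    """Return the operation allowed to make this transition, or None."""
    return ALLOWED_TRANSITIONS.get((current, target))
```

The table is an allowlist: CANCELLED → SHIPPED is rejected because no entry permits it, not because a rule forbids it.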

Type constraints. Values constrained to valid sets.

priority: LOW | MEDIUM | HIGH | CRITICAL  (not any string)
amount: PositiveDecimal                    (not any number)
email: EmailAddress                        (not any string)
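These constraints could be enforced at the boundary by parsing raw input into validated values. The sketch below assumes three hypothetical parser functions. The email pattern is deliberately loose and is not a full address validator.

```python
import re
from decimal import Decimal, InvalidOperation
from enum import Enum


class Priority(Enum):
    LOW = "LOW"
    MEDIUM = "MEDIUM"
    HIGH = "HIGH"
    CRITICAL = "CRITICAL"


def parse_priority(raw: str) -> Priority:
    """Raises ValueError for anything outside the four allowed values."""
    return Priority(raw)


def parse_positive_decimal(raw) -> Decimal:
    """Accept only finite decimals strictly greater than zero."""
    try:
        value = Decimal(str(raw))
    except InvalidOperation:
        raise ValueError(f"not a decimal: {raw!r}") from None
    if not value.is_finite() or value <= 0:
        raise ValueError(f"not a positive amount: {raw!r}")
    return value


# Loose shape check (something@something.something); an assumption, not RFC-complete.
_EMAIL_SHAPE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def parse_email(raw: str) -> str:
    if not _EMAIL_SHAPE.match(raw):
        raise ValueError(f"not an email address: {raw!r}")
    return raw
```

Once a value has passed through one of these parsers, the rest of the system can rely on it without re-checking.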

The Validation Layer

Like an OS kernel protecting system resources from user programs, a validation layer sits between the agent and system state:

Agent (selects and invokes operations)
         ↓
Validation Layer (checks preconditions, enforces constraints)
         ↓
System State (protected)

The agent can be sophisticated but unreliable. The validation layer is simple and auditable. Trust is placed in a small, verifiable component that can't be bypassed.
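A sketch of such a layer, assuming state is a plain dictionary and each operation is registered with labelled preconditions. ValidationLayer, Operation, and the result shape are hypothetical. Note that Python's underscore prefix is only a convention, so real isolation would need a process or service boundary.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Operation:
    preconditions: dict  # label -> predicate over the current state
    apply: Callable[[dict], dict]  # takes a copy of the state, returns the new state


class ValidationLayer:
    """The only path between the agent and system state."""

    def __init__(self, state: dict, operations: dict):
        self._state = dict(state)
        self._operations = operations

    def read(self) -> dict:
        # Hand out a copy so callers cannot mutate protected state.
        return dict(self._state)

    def invoke(self, name: str) -> dict:
        operation = self._operations.get(name)
        if operation is None:
            return {"ok": False, "error": "UNKNOWN_OPERATION", "detail": name}
        failed = [label for label, check in operation.preconditions.items()
                  if not check(self._state)]
        if failed:
            return {"ok": False, "error": "PRECONDITION_FAILED", "detail": failed}
        self._state = operation.apply(dict(self._state))
        return {"ok": True, "state": self.read()}


# Example registration with a single operation.
OPERATIONS = {
    "ship_order": Operation(
        preconditions={"status is CONFIRMED": lambda s: s["status"] == "CONFIRMED"},
        apply=lambda s: {**s, "status": "SHIPPED"},
    ),
}
```

The layer contains no domain logic of its own: it looks up the operation, evaluates its predicates, and either applies it or reports which checks failed. That narrow job is what keeps it small enough to audit.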

ACID Properties as State Constraints

The database world discovered the same principles:

Atomicity — Operations complete fully or not at all. No partial state changes that leave the system inconsistent.

Consistency — Every operation maintains invariants. The system is never observed in an invalid state.

Isolation — Concurrent operations don't interfere. Two agents can't both read balance=$100 and both withdraw $75.

Durability — Committed state persists through failures. You don't lose state changes to crashes.

These aren't database-specific—they're fundamental properties of reliable state management.
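The isolation example can be reproduced outside a database. In this sketch, which assumes an in-process Account class guarded by a lock, the balance check and the deduction happen as one step, so two concurrent withdrawals of $75 from $100 cannot both succeed.

```python
import threading


class Account:
    def __init__(self, balance: int):
        self._balance = balance
        self._lock = threading.Lock()

    @property
    def balance(self) -> int:
        return self._balance

    def withdraw(self, amount: int) -> bool:
        """Check and deduct under one lock; either both happen or neither does."""
        with self._lock:
            if self._balance < amount:
                return False
            self._balance -= amount
            return True


def concurrent_withdrawals(account: Account, amount: int, workers: int) -> list:
    """Run `workers` simultaneous withdrawals and collect each outcome."""
    results = []
    threads = [
        threading.Thread(target=lambda: results.append(account.withdraw(amount)))
        for _ in range(workers)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Without the lock, both threads could pass the balance check before either deducts, which is the double-withdrawal scenario described under Isolation.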

IV. State Recovery

When something goes wrong, the system can return to a known-good state.

Why This Matters

Errors will occur. The question is whether they're recoverable or catastrophic. A system with good recovery properties treats most errors as temporary setbacks. A system without them treats every error as potential corruption.

What Recovery Requires

Checkpointing. Save state at known-good points. When something fails at step 47, you restart from the last checkpoint, not from step 1.
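A sketch of the step-47 scenario, assuming state is a dictionary and a checkpoint is taken every ten steps. CheckpointedRun and its methods are hypothetical names.

```python
import copy


class CheckpointedRun:
    """Tracks a step counter and state, snapshotting at a fixed interval."""

    def __init__(self, initial_state: dict, every: int = 10):
        self.state = copy.deepcopy(initial_state)
        self.step = 0
        self.every = every
        self._saved_step = 0
        self._saved_state = copy.deepcopy(initial_state)

    def advance(self, action) -> None:
        """Run one step; on failure, restore the last checkpoint and re-raise."""
        try:
            action(self.state)
        except Exception:
            self.restore()
            raise
        self.step += 1
        if self.step % self.every == 0:
            self._saved_step = self.step
            self._saved_state = copy.deepcopy(self.state)

    def restore(self) -> None:
        # Deep copies keep the checkpoint untouched by later mutations.
        self.state = copy.deepcopy(self._saved_state)
        self.step = self._saved_step
```

If 46 steps succeed and the 47th raises partway through, the run resumes from step 40 with the state it had then, including undoing any partial changes the failed step made.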

Reversible operations. Operations that can be undone. approve_expense has a corresponding unapprove_expense. soft_delete_record has restore_record. The operation vocabulary includes ways to back out.

Idempotent operations. Operations safe to retry. If you're not sure whether ship_order succeeded, calling it again with the same parameters returns the same result without double-shipping.
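A minimal sketch of an idempotent ship_order, assuming results are remembered per order in memory. ShippingService and the labels_printed counter are hypothetical, with the counter standing in for the real-world side effect.

```python
class ShippingService:
    def __init__(self):
        self._shipments = {}  # order_id -> {"request": ..., "result": ...}
        self.labels_printed = 0  # stands in for the irreversible side effect

    def ship_order(self, order_id: str, carrier: str, tracking_number: str) -> dict:
        request = (carrier, tracking_number)
        existing = self._shipments.get(order_id)
        if existing is not None:
            if existing["request"] != request:
                # Same order, different details: a conflict, not a retry.
                raise ValueError("INVALID_STATE: order already shipped with different details")
            return existing["result"]  # retry: same answer, no new side effect
        self.labels_printed += 1
        result = {
            "order_id": order_id,
            "status": "SHIPPED",
            "carrier": carrier,
            "tracking_number": tracking_number,
        }
        self._shipments[order_id] = {"request": request, "result": result}
        return result
```

Distinguishing a retry (identical parameters) from a conflicting second attempt matters: the first is answered silently, the second is reported as an error.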

Compensating transactions. For operations that can't be truly undone, corresponding operations compensate. Can't un-send an email, but send_correction_email exists. Can't un-charge a card, but issue_refund exists.

The UNKNOWN escape hatch. When the agent can't determine which operation applies, it can say "I don't know" and route to human review. This prevents the agent from randomly trying operations hoping one works.
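One way to wire in the escape hatch, assuming the agent returns the name of its chosen operation as a string. The dispatch function, the UNKNOWN sentinel, and the list used as a review queue are all hypothetical.

```python
UNKNOWN = "UNKNOWN"


def dispatch(choice: str, handlers: dict, review_queue: list, context: dict):
    """Run the chosen operation, or escalate when the agent declines or names
    something that is not a registered operation."""
    if choice == UNKNOWN or choice not in handlers:
        review_queue.append({"choice": choice, "context": context})
        return "ESCALATED"
    return handlers[choice](context)
```

Treating an unrecognized operation name the same as UNKNOWN means a hallucinated operation ends up in front of a human instead of being silently dropped or guessed at.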

The DNA Lesson

Replicating the human genome means copying roughly 3 billion base pairs per cell division. Without correction, the raw error rate would be on the order of 1 in 10,000. At that rate, you'd get about 300,000 errors per division, which would be catastrophic.

The actual error rate is about 1 in a billion. How?

Proofreading — The polymerase checks each nucleotide immediately after adding it, backing up to fix errors
Mismatch repair — Separate enzymes scan for errors that slipped through
Additional repair mechanisms — Various systems catch accumulated damage

This is error correction through state recovery: detect the problem, return to good state, proceed. The same pattern applies to agent systems.

V. State History

The agent's actions become observable state.

Why This Matters

If you can't see what the agent did, you can't debug failures, measure accuracy, or improve the system. The agent's decisions—which operations it invoked and why—are themselves state that must be visible.

What State History Provides

Audit trail. Every operation invocation is recorded: which operation, what inputs, what the outcome was. When something goes wrong, you can reconstruct what happened in domain terms: "The agent called ship_order, then cancel_order, then ship_order again."

Accuracy measurement. If decisions are logged with their context, you can sample them, have humans evaluate correctness, and measure: "Was ship_order the right operation to invoke here?"

Pattern detection. Logged operations reveal patterns: the agent keeps trying operations that fail their preconditions; the agent cycles between two operations without progress; certain operation sequences correlate with failures.

Recovery information. To undo or compensate, you need to know what operations occurred. "We need to reverse the last three operations" requires knowing what they were.

What to Record

For every operation invocation:

Operation name — Which domain operation was invoked
Inputs — What parameters were provided
Precondition state — Did preconditions pass? Which ones failed?
Outcome — Success, failure, which error
Resulting state — What state did the system end up in?
Agent reasoning — Why did the agent select this operation? (if available)
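The fields above map naturally onto a record type. This sketch assumes one JSON line per invocation. OperationRecord, to_log_line, and the added timestamp field are illustrative choices, not a prescribed schema.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class OperationRecord:
    operation: str  # which domain operation was invoked
    inputs: dict  # parameters provided
    failed_preconditions: list  # empty when all preconditions passed
    outcome: str  # "SUCCESS" or an error code
    resulting_state: dict  # state after the invocation
    reasoning: Optional[str] = None  # agent's stated rationale, if available
    # Timestamp is an addition for ordering records; not in the list above.
    recorded_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def to_log_line(record: OperationRecord) -> str:
    """Serialize one record as a single JSON line for an append-only log."""
    return json.dumps(asdict(record), sort_keys=True)
```

Recording failed attempts with the same structure as successful ones is what later makes precondition failures and retry loops visible.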

Operations as First-Class Events

The operation log isn't just debugging infrastructure—it's part of the system's state. "What operations has this agent invoked?" is as legitimate a query as "What is this order's status?"

This enables:

Querying for recent operations by type
Detecting patterns (agent invoking same operation repeatedly, operation sequences that indicate confusion)
Rate limiting and budgets (no more than N refund operations per hour)
Compliance and audit requirements (who authorized this state change?)
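Two of these queries are sketched below, assuming the log is a list of dictionaries with "operation", "outcome", and "at" (a datetime) keys. The function names and that log shape are hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta


def count_recent(log: list, operation: str, now: datetime, window: timedelta) -> int:
    """Count invocations of `operation` within `window` before `now`."""
    return sum(
        1 for entry in log
        if entry["operation"] == operation
        and timedelta(0) <= now - entry["at"] <= window
    )


def within_budget(log: list, operation: str, now: datetime, limit: int,
                  window: timedelta = timedelta(hours=1)) -> bool:
    """True if one more invocation would stay within `limit` per window."""
    return count_recent(log, operation, now, window) < limit


def repeated_failures(log: list, threshold: int = 3) -> dict:
    """Find (operation, error) pairs seen at least `threshold` times."""
    failures = Counter(
        (entry["operation"], entry["outcome"])
        for entry in log
        if entry["outcome"] != "SUCCESS"
    )
    return {key: count for key, count in failures.items() if count >= threshold}
```

A validation layer could call within_budget as a precondition on refund operations, which turns the history itself into a constraint.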

Synthesis: The Five Properties

A system is computationally accessible to agents when it provides:

Property	What It Means	What Enables It
Visibility	Agent knows current state	Discrete states, read operations, typed feedback
Operations	Agent has meaningful actions	Domain verbs with clear preconditions and effects
Constraints	Invalid states unreachable	Preconditions, invariants, validation layer, types
Recovery	Can return to good state	Checkpoints, reversibility, idempotency, UNKNOWN
History	Actions are observable	Logging, audit trails, operation records

These properties are mutually reinforcing:

Visibility tells the agent which operations' preconditions are met
Operations make state changes predictable, aiding visibility
Constraints ensure operations can't produce states that break visibility
History records which operations occurred, enabling recovery
Recovery is expressed through compensating operations

Computational Primitives

The operations a system must provide for agents to maintain state control:

For Visibility

Read state — Query current values and status
Verify effects — Confirm an operation had the expected result
Typed errors — Know why an operation failed

For Operations

Domain-specific verbs — Actions with business meaning (not generic CRUD)
Explicit preconditions — When each operation applies
Defined effects — What each operation changes
Composable sequences — Operations that chain into workflows

For Constraints

Precondition enforcement — Operations reject invalid invocations
Invariant checking — System-wide rules verified after each change
Atomic execution — Operations complete fully or not at all
Type validation — Invalid values rejected at the boundary

For Recovery

Checkpoint/restore — Save and return to known-good state
Inverse operations — Operations that undo other operations
Idempotent operations — Safe to retry on uncertainty
Escalation — Route to human when no operation clearly applies

For History

Operation logging — Record every invocation with context
Outcome capture — Store success/failure and resulting state
Query interface — Retrieve operation history

The Error Compounding Problem

Why state control matters more for agents than for humans:

With 99% per-decision accuracy:

10 decisions: 90% overall success
100 decisions: 37% overall success
500 decisions: 0.7% overall success

The probability of an error-free run decays geometrically with the number of decisions. The only escape is error correction: detecting and fixing errors before they propagate.
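The figures above follow from raising the per-decision accuracy to the number of decisions, assuming errors are independent. The sketch also includes a simple model of correction, in which each error is caught and fixed with some probability. That model and the function names are assumptions for illustration.

```python
import math


def success_probability(accuracy: float, decisions: int) -> float:
    """Chance that every one of `decisions` independent decisions is right."""
    return accuracy ** decisions


def max_decisions(accuracy: float, target: float) -> int:
    """Largest number of decisions that keeps overall success at or above `target`."""
    return math.floor(math.log(target) / math.log(accuracy))


def success_with_correction(accuracy: float, catch_rate: float, decisions: int) -> float:
    """Same model, but each error is caught and fixed with probability `catch_rate`."""
    uncorrected_error = (1 - accuracy) * (1 - catch_rate)
    return (1 - uncorrected_error) ** decisions


for n in (10, 100, 500):
    print(n, round(success_probability(0.99, n), 3))
```

Under this model, 99% accuracy keeps overall success above 50% for only 68 decisions. If 90% of errors are caught and corrected, success over 100 decisions rises from about 37% to about 90%.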

Error correction requires state control:

Visibility to detect the error
Operations to know how to fix it
Constraints to limit damage
Recovery to execute the fix
History to understand what happened

Without state control, agents hit a wall: past a certain task length, failure becomes almost certain. With state control, agents can keep correcting as they go, so reliability no longer collapses as tasks grow longer.

Summary

Reliable agent operation is state control:

Know the state. Discrete, observable, verifiable.

Provide meaningful operations. Domain-specific actions with clear preconditions, clear effects, and clear applicability. The agent reasons about what to do, not how to manipulate fields.

Constrain the state. Invalid states should be unreachable through the operations agents have access to.

Enable recovery. When something goes wrong, return to known-good state through inverse or compensating operations.

Record everything. Operations invoked are state too. Make them visible.

Every other concept—ACID, type systems, validation layers, checkpointing, error correction codes, feedback loops—is one of these five properties applied to a specific context.

Build systems that provide these properties, and agents can operate reliably. Skip any of them, and you're hoping the agent never makes a mistake—a hope that will be disappointed.