Counter-Arguments and Objections — For Review
This document catalogs counter-arguments, alternative approaches, and failure cases that the essays could address. Review these to decide which deserve direct treatment.
1. Counter-Arguments to the Core Thesis
"Models will get smart enough to not need structure"
The objection: Current structure requirements are a temporary limitation. Future models with better reasoning, larger context windows, and improved world models will handle unstructured domains reliably.
Potential response:
- Structure isn't a crutch for limited capability—it's how reliability becomes possible at all
- Even perfect reasoning needs determinate success criteria
- More capability without structure means faster, more systematic errors
- The halting problem / Gödel argument (though the formal analogy is loose)
Strength of objection: Medium. This is the most common objection and deserves direct treatment.
"Structure kills creativity and flexibility"
The objection: By encoding everything into explicit rules, you lose the human ability to adapt, improvise, and handle genuinely novel situations. Rigid structure produces bad outcomes at the edges.
Potential response:
- Structure ≠ rigidity (the exception problem essay addresses this)
- Well-designed structure includes escape hatches
- "Structured flexibility" is the answer—bounded exceptions, not unbounded discretion
- Creative domains might be genuinely different (worth acknowledging)
Strength of objection: Medium-High. The creative domain counter-example is underexplored.
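The "bounded exceptions, not unbounded discretion" response can be made concrete. A minimal sketch in Python, assuming a hypothetical refund workflow; the rules, thresholds, and names are invented for illustration:

```python
from dataclasses import dataclass
from enum import Enum, auto

class Outcome(Enum):
    APPROVE = auto()
    REJECT = auto()
    ESCALATE = auto()  # the escape hatch: a bounded exception path

@dataclass
class RefundRequest:
    amount: float
    days_since_purchase: int
    reason: str

def decide(req: RefundRequest) -> Outcome:
    """Explicit rules cover the common cases; anything outside the
    rules' domain exits through a defined escalation path instead of
    being handled by improvised agent discretion."""
    if req.days_since_purchase <= 30 and req.amount <= 100:
        return Outcome.APPROVE
    if req.days_since_purchase > 365:
        return Outcome.REJECT
    # Novel or ambiguous cases escalate rather than guess.
    return Outcome.ESCALATE

print(decide(RefundRequest(50, 10, "defective")))  # Outcome.APPROVE
```

The point of the sketch: flexibility lives in a named, auditable outcome (ESCALATE), not in letting the system act outside the rules.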
"The cost of structure exceeds benefits for most use cases"
The objection: Building explicit structure is expensive. For many processes, the ROI doesn't justify the investment. Humans filling gaps is cheaper than engineering precise rules.
Potential response:
- This was true before agents—agents change the economics
- Structure investment compounds; gap-filling doesn't
- The question is: do you want reliable automation or not?
- Acknowledge that some domains genuinely aren't worth structuring
Strength of objection: High. The essays somewhat hand-wave cost/benefit analysis.
"This is just traditional software engineering dressed up"
The objection: Everything in these essays—state machines, validation, type systems, domain modeling—has been known for decades. There's nothing new here.
Potential response:
- Yes, and that's the point—these fundamentals apply to agents too
- What's new: agents make structure economically viable for domains where it wasn't
- The insight is that AI doesn't eliminate these requirements; it makes them more important
- The field is rediscovering known principles in a new context
Strength of objection: Medium. Fair point—the essays could be more explicit about building on established patterns.
2. Alternative Approaches
Checking agents instead of validation layers
The approach: Use a second, more powerful agent to verify the first agent's outputs instead of building explicit validation.
What the essays say: This works for judgment calls but doesn't eliminate structure—just relocates it. Two agents can agree on wrong answers.
What's underexplored:
- When IS agent-based checking the right choice?
- How to combine validation layers with checking agents
- The cost/latency tradeoffs
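One way the combination could work, sketched under assumptions: deterministic validation runs first and is non-negotiable; only outputs that pass it reach a second agent for judgment calls. `check_with_agent` is a hypothetical stand-in for a second-model call, not a real API:

```python
def validate(output: dict) -> list[str]:
    """Cheap, deterministic checks: structure and invariants."""
    errors = []
    if "total" not in output:
        errors.append("missing field: total")
    elif output["total"] < 0:
        errors.append("invariant violated: total must be non-negative")
    return errors

def check_with_agent(output: dict) -> bool:
    """Placeholder for a judgment-call review by a second agent.
    Two agents can agree on wrong answers, so this never replaces
    validate(); it only covers what validation cannot express."""
    return True  # stub

def accept(output: dict) -> bool:
    if validate(output):               # hard failures never reach the checker
        return False
    return check_with_agent(output)    # judgment calls go to the agent

print(accept({"total": 42}))  # True
print(accept({"total": -1}))  # False
```

The layering also answers the latency question in part: the expensive checker is only invoked on outputs that already satisfy the cheap checks.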
General-purpose tool use vs. DSLs
The approach: Give agents access to general-purpose tools (APIs, code execution) rather than domain-specific languages.
What the essays say: This recovers all the original problems—arbitrary actions, violated invariants, unpredictable failures.
What's underexplored:
- Hybrid approaches (general tools with DSL-like constraints)
- Whether some domains genuinely need general-purpose capability
- Gradual migration paths from general tools to DSLs
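A hybrid of general tools with DSL-like constraints might look like the following sketch: the agent nominally has general tool access, but every call passes through an allowlist and per-tool argument validation before anything executes. Tool names and rules are illustrative:

```python
ALLOWED_TOOLS = {
    # tool name -> (required args, argument validator)
    "lookup_order": ({"order_id"}, lambda a: a["order_id"].startswith("ORD-")),
    "send_email":   ({"to", "body"}, lambda a: "@" in a["to"]),
}

def dispatch(tool: str, args: dict):
    """Constrained dispatch: reject unknown tools, missing arguments,
    and arguments that violate per-tool constraints."""
    if tool not in ALLOWED_TOOLS:
        raise PermissionError(f"tool not in allowlist: {tool}")
    required, validator = ALLOWED_TOOLS[tool]
    missing = required - args.keys()
    if missing:
        raise ValueError(f"missing args: {sorted(missing)}")
    if not validator(args):
        raise ValueError(f"arguments violate constraints for {tool}")
    return ("ok", tool, args)  # the real tool call would happen here

print(dispatch("lookup_order", {"order_id": "ORD-123"}))
```

This keeps the generality of the tool surface while restoring the DSL's key property: invalid actions are unrepresentable at the boundary, not merely discouraged.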
Big-bang structure design vs. incremental
The approach: Invest heavily upfront in comprehensive domain modeling before deploying any agents.
What the essays say: This doesn't work because structure must be discovered through operation.
What's underexplored:
- When IS upfront investment worthwhile?
- How much structure is "enough" to start?
- Domains where discovery is minimal (heavily regulated, highly stable)
Constitutional AI / RLHF as alternative to external structure
The approach: Train models with better values and judgment, reducing the need for external constraints.
What's missing from essays:
- These approaches are complementary, not alternatives
- Training can help but doesn't provide determinate success criteria
- The verification problem remains
3. Failure Cases and Edge Cases
Domains where structure exists but agents still fail
Examples to explore:
- Heavily structured enterprise software (SAP, Salesforce), where agents still struggle
- Gaming, which is highly structured, yet AI game-playing remains unreliable outside narrow cases
- Financial trading, which is extremely structured, yet automation failures remain common
Questions: What's missing in these cases? Is it structure quality, interface design, or something else?
Over-structured systems that became brittle
Examples to explore:
- Healthcare systems where rigid protocols cause harm in edge cases
- Customer service systems where "computer says no" destroys relationships
- Compliance systems that optimize for rule-following over outcomes
Questions: How do you distinguish good structure from bureaucratic ossification?
When the boundary model breaks down
Scenarios:
- Domains where there's no natural "first judgment" to start with
- Processes where judgments are deeply entangled (can't isolate one)
- Situations where UNKNOWN rate stays persistently high
- Cases where human escalation capacity is genuinely insufficient
Questions: What do you do when incremental deployment doesn't work?
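The "persistently high UNKNOWN rate" and "insufficient escalation capacity" scenarios suggest a detectable failure signal. A sketch of such a monitor, with thresholds that are illustrative rather than recommended:

```python
from collections import deque

class BoundaryMonitor:
    """Tracks outcomes over a sliding window and flags when incremental
    deployment is failing: UNKNOWN rate stays high, or the human
    escalation queue exceeds capacity."""

    def __init__(self, window: int = 100, max_unknown_rate: float = 0.2,
                 escalation_capacity: int = 50):
        self.outcomes = deque(maxlen=window)
        self.max_unknown_rate = max_unknown_rate
        self.escalation_capacity = escalation_capacity
        self.escalation_queue = 0

    def record(self, outcome: str) -> None:
        self.outcomes.append(outcome)
        if outcome == "UNKNOWN":
            self.escalation_queue += 1

    @property
    def unknown_rate(self) -> float:
        if not self.outcomes:
            return 0.0
        return self.outcomes.count("UNKNOWN") / len(self.outcomes)

    def healthy(self) -> bool:
        return (self.unknown_rate <= self.max_unknown_rate
                and self.escalation_queue <= self.escalation_capacity)

m = BoundaryMonitor()
for o in ["OK"] * 9 + ["UNKNOWN"]:
    m.record(o)
print(m.unknown_rate, m.healthy())  # 0.1 True
```

A monitor like this does not answer what to do when the model breaks down, but it gives a determinate trigger for the question instead of letting a failing deployment drift.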
Structure discovery that stalled
Scenarios:
- Teams that iterated for years without converging on good structure
- Domains where the rules genuinely keep changing faster than structure can adapt
- Organizations where political resistance prevented structure from crystallizing
Questions: How do you know when to give up on structuring a domain?
4. Open Objections to Address
What about emergent behavior requirements?
The objection: Some valuable agent behaviors can't be specified upfront—they emerge from the interaction of capabilities. Requiring explicit structure precludes beneficial emergence.
Possible responses:
- Distinguish "good emergent" (creative solutions) from "bad emergent" (unpredictable failures)
- Structure can bound emergence without eliminating it
- Most business processes don't actually want emergence—they want reliability
How does this apply to creative tasks?
The objection: Creative work (writing, design, brainstorming) doesn't have "correct answers" or "valid states." The whole framework seems inapplicable.
Possible responses:
- Creative tasks might genuinely be different (acknowledge the limit)
- Even creative work has constraints (brand guidelines, audience needs, project scope)
- Distinguish "generation" (creative) from "evaluation" (can be structured)
- Creative tasks may benefit less from full automation anyway
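The generation/evaluation split can be illustrated concretely: even when generation resists structure, the evaluation side often admits deterministic checks. A sketch assuming hypothetical brand constraints on marketing copy:

```python
# Illustrative constraints; real brand guidelines would differ.
BANNED_PHRASES = {"guaranteed results", "risk-free"}
MAX_HEADLINE_LEN = 60

def evaluate_copy(headline: str, body: str) -> list[str]:
    """Deterministic checks a generated draft must pass. Whether the
    copy is *good* remains a human judgment; whether it is *valid*
    against the constraints does not."""
    issues = []
    if len(headline) > MAX_HEADLINE_LEN:
        issues.append(f"headline exceeds {MAX_HEADLINE_LEN} chars")
    for phrase in BANNED_PHRASES:
        if phrase in body.lower():
            issues.append(f"banned phrase: {phrase!r}")
    return issues

print(evaluate_copy("Spring sale", "Save 20% this week."))  # []
```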
What about multi-agent systems?
The objection: These essays focus on single-agent architectures. What about systems where multiple agents collaborate, negotiate, or compete?
What's missing:
- Inter-agent communication needs structure too
- Coordination protocols, shared state, conflict resolution
- How the boundary model applies when agents are the "humans" for each other
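The "inter-agent communication needs structure too" point can be sketched as a typed message envelope; field names and message types are invented for illustration:

```python
from dataclasses import dataclass, field
from enum import Enum, auto

class MsgType(Enum):
    PROPOSE = auto()
    ACCEPT = auto()
    REJECT = auto()
    ESCALATE = auto()  # one agent acting as the "human" boundary for another

@dataclass(frozen=True)
class AgentMessage:
    """A minimal coordination protocol: messages are typed and validated
    at construction, so malformed inter-agent traffic cannot exist."""
    sender: str
    recipient: str
    msg_type: MsgType
    payload: dict = field(default_factory=dict)

    def __post_init__(self):
        if self.sender == self.recipient:
            raise ValueError("sender and recipient must differ")

msg = AgentMessage("planner", "executor", MsgType.PROPOSE, {"task": "t-1"})
print(msg.msg_type)  # MsgType.PROPOSE
```

The same boundary-model questions then recur one level up: who validates the protocol, and where do irresolvable disagreements between agents escalate to.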
Isn't this just for "boring" enterprise work?
The objection: Structure makes sense for back-office operations, but the exciting AI applications are in open-ended domains where this doesn't apply.
Possible responses:
- Most economic value is in "boring" work
- Open-ended applications might remain human-operated longer
- Even "exciting" domains often have structured subproblems
- The essay collection explicitly focuses on business operations (acknowledge scope)
What about agents that learn and adapt?
The objection: Static structure doesn't capture domains that evolve. Agents should learn the structure as they operate, not have it imposed externally.
Possible responses:
- Learning structure is fine—the issue is verifying what was learned
- Agent-discovered structure still needs external validation
- "Who specifies the specification" essay explores this
5. Missing Evidence
What we'd ideally have:
- Case studies of deployments that failed due to lack of structure
- Case studies of deployments that succeeded with structure
- Quantitative data on error rates: structured vs. unstructured
- Longitudinal data on structure discovery timelines
- Cost/benefit analysis of structure investment
- Comparison of DSL approach vs. general tool use in same domain
Why we don't have it:
- Field is too new for academic studies
- Companies don't publish failure data
- Success stories are marketing, not rigorous analysis
- Proprietary implementations aren't documented
6. Recommendations
High priority (address in essays):
Cost/benefit of structure investment — The essays assert structure is valuable but don't quantify. At minimum, acknowledge the tradeoff explicitly.
Creative and open-ended domains — Be more explicit about the scope of the framework. Not everything fits.
When the boundary model doesn't work — The incremental approach is presented as universal. Acknowledge its limits.
Medium priority (consider addressing):
Hybrid approaches — General tools with DSL-like constraints. Real systems will be hybrid.
Agent-based checking — When and how to combine with validation layers.
Multi-agent coordination — Growing in relevance as agentic systems become more complex.
Lower priority (acknowledge but don't elaborate):
Constitutional AI / training-based approaches — Complementary, not competing.
"Just software engineering" objection — Acknowledge and move on.
Learning/adapting agents — Covered somewhat in "who specifies the specification."