If you want agent behavior to be safer and more predictable, you need three different control layers.
- structured outputs
- guardrails
- execution boundaries
Those terms often get thrown together as if they describe one general safety layer.
They do not.
They solve different problems.
The short version is this:
Structured outputs control the shape of what the model produces. Guardrails control what should be allowed by policy. Execution boundaries control what the system can actually do.
That last layer matters most.
Why?
Because a model can produce perfectly valid JSON and still request the wrong action.
A guardrail can correctly flag something as risky and still fail open if the runtime is badly designed.
But a real execution boundary can still stop the side effect.
This article builds on "Tool Use: How Agents Take Action," "Goals, Constraints, and Success Conditions," "Context Engineering: The New Core Skill," and "When to Use a Workflow Instead of an Agent." Those pieces explain how agent systems act, what they are optimizing for, and when to keep autonomy narrow. This one explains how to bound that autonomy once the system is allowed to act.
Why Teams Confuse These Layers
The confusion is easy to understand.
All three layers are used to make model-driven systems feel more reliable.
All three sit somewhere between model output and real-world effect.
All three are often described using vague language like:
- safety
- control
- validation
- constraints
- guardrails
That vocabulary blurs important differences.
A schema validator is not doing the same job as a policy checker.
A policy checker is not doing the same job as a permission system.
A prompt that says "never do anything dangerous" is not an execution boundary at all.
The cleanest way to think about this is to separate the failure surfaces.
- Structured outputs reduce formatting failure.
- Guardrails reduce policy and misuse failure.
- Execution boundaries reduce side-effect and blast-radius failure.
If you collapse those into one bucket, you start trusting the wrong thing.
That is how teams end up saying things like:
- "It is safe because it returns JSON."
- "It has guardrails, so it can call the tool."
- "The prompt says ask for approval first."
Those statements sound comforting.
They are also weak.
Structured Outputs: Control the Shape
Structured outputs constrain the format of the model response.
That usually means you define a schema and require the model to return data that matches it.
In practical terms, structured outputs are there to make downstream handling reliable.
They help with things like:
- required keys
- allowed enum values
- stable field names
- typed arguments
- machine-readable responses
This matters because agent systems often need model output to feed directly into:
- tool arguments
- workflow branches
- UI components
- approval requests
- logging and evaluation systems
Without structure, that handoff gets brittle quickly.
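In code, the contract is usually just a schema that parsed output must satisfy. Here is a minimal sketch using Pydantic v2; the ticket-shaped fields are illustrative, not a prescribed format.

```python
from enum import Enum

from pydantic import BaseModel, Field, ValidationError


class Priority(str, Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


class TicketAction(BaseModel):
    """The shape the downstream system expects from the model."""
    ticket_id: str                       # required key
    priority: Priority                   # allowed enum values only
    summary: str = Field(min_length=1)   # typed, non-empty
    tags: list[str] = []                 # stable field name, typed array


raw = '{"ticket_id": "T-123", "priority": "high", "summary": "Reset MFA"}'

try:
    action = TicketAction.model_validate_json(raw)  # machine-readable handoff
    print(action.priority)
except ValidationError as err:
    # A missing key or invalid enum value fails here, before anything downstream runs.
    print(err)
```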
So structured outputs are valuable.
They are also limited.
They do not prove that the output is correct.
They do not prove that the action is allowed.
They do not prove that the model understood the request properly.
They do not stop a model from producing a perfectly well-formed bad decision.
That is the critical point.
Structured Outputs vs JSON Mode
This distinction matters because many teams still treat valid JSON as the main reliability target.
That is too weak.
JSON mode mainly helps ensure the output parses as JSON.
Structured outputs go further by pushing the model to follow an explicit schema.
That is a real improvement, but it is still a formatting guarantee, not a judgment guarantee.
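A small sketch makes the gap concrete. The Refund shape below is an illustrative assumption:

```python
import json

from pydantic import BaseModel, ValidationError


class Refund(BaseModel):
    order_id: str
    amount_cents: int


raw = '{"order": "A-9", "amount": "a lot"}'

# JSON mode's guarantee: this parses, so a "valid JSON" check passes.
json.loads(raw)

# A structured-output check goes further: wrong keys and types fail loudly.
try:
    Refund.model_validate_json(raw)
except ValidationError:
    print("parseable, but not the object the system needs")
```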
So the right mental model is:
Structured outputs make outputs easier to handle. They do not make outputs safe to trust.
Guardrails: Control the Policy Layer
Guardrails sit one layer above format.
They are checks that inspect the request, the response, the tool call, or the state transition for policy risk.
Depending on the system, guardrails may look for:
- prompt injection attempts
- PII exposure
- disallowed topics
- unsafe content
- policy violations
- suspicious tool arguments
- low-confidence states
- requests that require escalation
A guardrail is basically a policy gate.
It asks:
Should this continue?
That is different from asking:
Is this valid JSON?
A model can produce a schema-perfect purchase request for an unauthorized vendor.
A model can output a well-typed tool call that still violates policy.
That is where guardrails belong.
They are there to detect that the request or response should not proceed as-is.
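In code, a guardrail often reduces to a gate function in front of the action. A minimal sketch, with illustrative checks that a real system would replace with configured policy and classifiers:

```python
import re

SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
INJECTION_MARKERS = ("ignore previous instructions", "disregard your rules")


def should_proceed(tool_name: str, args: dict) -> tuple[bool, str]:
    """A policy gate: answers 'should this continue?', not 'is this valid JSON?'."""
    blob = " ".join(str(v) for v in args.values()).lower()
    if SSN_PATTERN.search(blob):
        return False, "possible PII in tool arguments"
    if any(marker in blob for marker in INJECTION_MARKERS):
        return False, "possible prompt injection echoed into tool arguments"
    return True, "ok"
```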
The Most Important Limitation of Guardrails
Guardrails are only as strong as the control path attached to them.
If a guardrail only logs a warning, it is not a real stop.
If a guardrail fails open under load, it is not a real stop.
If a guardrail lives only in the prompt and not in the runtime, it is definitely not a real stop.
So guardrails matter, but they should be treated as policy enforcement logic, not magic safety dust.
They improve the odds that the system notices risky behavior.
They do not remove the need for actual execution limits.
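The difference between a warning and a stop is visible in the control path itself. A fail-closed sketch, assuming a check that returns an (allowed, reason) pair:

```python
def guarded_call(check, action, payload):
    """Run `action` only if `check` explicitly allows it; anything else blocks."""
    try:
        allowed, reason = check(payload)
    except Exception:
        allowed, reason = False, "guardrail error"   # fail closed, not open
    if not allowed:
        raise PermissionError(f"blocked: {reason}")  # a hard stop, not a log line
    return action(payload)
```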
Execution Boundaries: Control the Power
Execution boundaries are the strongest layer because they define what the agent can actually do.
This is where control becomes real.
Execution boundaries include things like:
- tool allowlists
- read-only vs write access
- sandboxed execution
- filesystem scope
- network restrictions
- spending limits
- approval requirements
- credential scoping
- branch restrictions
- iteration limits
- timeout limits
This is the layer that answers:
Even if the model wants to do this, is the system actually allowed to do it?
That is why execution boundaries matter more than prompts and more than post-hoc warnings.
If the agent is not allowed to write outside a project directory, then a bad decision cannot escape that boundary.
If the agent is not allowed to make arbitrary network requests, then a malicious prompt injection has less room to cause damage.
If purchases above a threshold require human approval, then a mistaken model judgment cannot immediately create an irreversible side effect.
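Here is a minimal sketch of what that looks like in the runtime; the tool names and directory are illustrative assumptions. The key property is that the check lives in the code that performs the side effect, not in the prompt.

```python
from pathlib import Path

PROJECT_ROOT = Path("/srv/agent-workspace").resolve()
ALLOWED_TOOLS = {"read_file", "write_file"}


def run_tool(tool_name: str, relative_path: str, content: str = "") -> str:
    if tool_name not in ALLOWED_TOOLS:               # tool allowlist
        raise PermissionError(f"tool {tool_name!r} is not allowlisted")
    target = (PROJECT_ROOT / relative_path).resolve()
    if not target.is_relative_to(PROJECT_ROOT):      # filesystem scope
        raise PermissionError("path escapes the project directory")
    if tool_name == "write_file":
        target.write_text(content)                   # the side effect
        return "ok"
    return target.read_text()
```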
This is where "What Is an AI Agent?" becomes relevant again. As soon as the system can act in an environment instead of only answering in text, permissions and runtime limits become part of the core architecture.
Prompt Instructions Are Not Execution Boundaries
This is worth saying plainly.
"Ask for approval before doing anything risky" is not an execution boundary.
It is an instruction.
It may help.
It may even work most of the time.
But unless the runtime actually requires approval before the side effect can happen, the control is soft.
Real execution boundaries live in:
- the tool layer
- the runtime
- the sandbox
- the identity and credential model
- the workflow approval path
That is where the actual power should be constrained.
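Compare the prompt version with a runtime version. A minimal sketch, assuming an HMAC-signed token that only the approval workflow can issue:

```python
import hashlib
import hmac

APPROVAL_KEY = b"rotate-me"  # illustrative; a real system uses a managed secret


def sign_approval(order_id: str) -> str:
    """Issued by the approval workflow after a human signs off, never by the model."""
    return hmac.new(APPROVAL_KEY, order_id.encode(), hashlib.sha256).hexdigest()


def send_order(order_id: str, approval_token: str) -> None:
    if not hmac.compare_digest(sign_approval(order_id), approval_token):
        raise PermissionError("order requires a valid approval token")
    # The side effect happens only past this point.
    print(f"order {order_id} submitted")
```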
The F.P.E. Control Ladder
The cleanest way to design this stack is to use a simple rule.
Call it The F.P.E. Control Ladder.
F.P.E. stands for:
- Format
- Policy
- Execution
Each step up the ladder controls a more consequential layer of the system.
F: Format
Use structured outputs to make model responses and tool arguments machine-reliable.
This reduces parsing errors, branching ambiguity, and brittle handoffs.
P: Policy
Use guardrails to inspect whether the request, response, or proposed action violates your rules.
This reduces misuse, unsafe behavior, and off-policy actions.
E: Execution
Use runtime permissions and bounded tooling to define what the agent can actually do.
This reduces damage even when the earlier layers fail.
The ladder matters because teams often invert it.
They spend energy on prompt wording and output formatting while leaving the actual side effects too open.
That is backwards.
The real design rule is:
Start by bounding power. Then enforce policy. Then make the interface clean and typed.
Format makes the system easier to use.
Policy makes the system safer to route.
Execution makes the system safer to trust.
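Read off a control path, the ladder looks something like this. Every name in the sketch is illustrative:

```python
from pydantic import BaseModel, ValidationError

INTERNAL_DOMAIN = "@example.com"  # illustrative runtime boundary


class EmailDraft(BaseModel):                       # F: format contract
    to: str
    body: str


def policy_ok(draft: EmailDraft) -> bool:          # P: policy gate
    return "confidential" not in draft.body.lower()


def send_email(draft: EmailDraft) -> None:         # E: execution boundary
    if not draft.to.endswith(INTERNAL_DOMAIN):     # runtime refuses external sends
        raise PermissionError("external recipients are out of scope for this agent")
    print(f"sent to {draft.to}")


def handle(raw_model_output: str) -> None:
    try:
        draft = EmailDraft.model_validate_json(raw_model_output)
    except ValidationError as err:
        raise ValueError("format failure") from err
    if not policy_ok(draft):
        raise PermissionError("policy failure")
    send_email(draft)  # even a policy miss cannot cross the runtime boundary
```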
A Running Example: A Purchase Agent
Let's use one example and hold it constant.
Suppose you are building an internal purchase agent.
Its job is to help employees request equipment and small software purchases.
The agent can:
- read the request
- classify the need
- gather vendor options
- prepare a purchase object
- send an approval request
- trigger procurement steps once approved
Now apply the ladder.
Format Layer
You require the model to return a structured purchase object like:
- item
- quantity
- vendor
- estimated cost
- justification
- policy category
- approval required
That helps your downstream system route the request predictably.
If the model forgets the cost field or invents an unsupported category, the schema catches it.
Good.
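Here is one way that object might be encoded; the fields follow the list above, while the types and category values are assumptions:

```python
from decimal import Decimal
from enum import Enum

from pydantic import BaseModel, Field


class PolicyCategory(str, Enum):       # an invented category fails validation
    EQUIPMENT = "equipment"
    SOFTWARE = "software"


class PurchaseRequest(BaseModel):
    item: str
    quantity: int = Field(gt=0)
    vendor: str
    estimated_cost: Decimal = Field(gt=0)    # a forgotten cost field fails here
    justification: str = Field(min_length=1)
    policy_category: PolicyCategory
    approval_required: bool
```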
But that alone does not tell you whether the purchase is allowed.
Policy Layer
Now add guardrails.
The system checks things like:
- is the vendor approved?
- does the justification match policy?
- is the request suspicious or unrelated to work?
- does the model appear to be following a prompt injection from an uploaded document?
- is the user asking the agent to bypass procurement rules?
Now the system can flag or block policy-breaking requests.
Better.
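In sketch form, assuming the PurchaseRequest model from the format layer (the vendor list and rules are illustrative):

```python
APPROVED_VENDORS = {"acme-supplies", "globex"}


def purchase_policy_gate(req: "PurchaseRequest", user_message: str) -> list[str]:
    """Return policy violations; an empty list means clear to proceed."""
    violations = []
    if req.vendor not in APPROVED_VENDORS:
        violations.append("vendor not on the approved list")
    if "skip procurement" in user_message.lower():
        violations.append("user asked the agent to bypass procurement rules")
    if len(req.justification.split()) < 3:
        violations.append("justification too thin to check against policy")
    return violations
```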
But that still does not decide what the agent is capable of doing in the runtime.
Execution Layer
Now add execution boundaries.
The agent:
- cannot submit purchases above a fixed dollar threshold
- cannot send orders directly to arbitrary vendors
- cannot use unrestricted network access
- cannot bypass the approval API
- cannot execute procurement actions without a valid approval token
- cannot use credentials outside a narrowly scoped purchasing role
Now the system has real safety.
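Continuing the same sketches, the runtime side might look like this; the threshold and token handling are assumptions:

```python
from decimal import Decimal

SPEND_LIMIT = Decimal("500.00")  # illustrative threshold


def submit_purchase(req: "PurchaseRequest", approval_token: str | None) -> None:
    """The only code path that can create the side effect."""
    if req.vendor not in APPROVED_VENDORS:    # reuses the policy sketch's allowlist
        raise PermissionError("vendor outside the allowlisted set")  # hard stop
    if req.estimated_cost > SPEND_LIMIT and approval_token is None:
        raise PermissionError("above threshold: human approval required first")
    # ...call the procurement API with a narrowly scoped credential here
```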
Even if the model:
- misunderstood the user
- produced a policy-bad suggestion
- got partially injected
- returned valid structured output for the wrong action
the runtime can still stop the consequence.
That is the architectural difference.
What Goes Wrong When One Layer Is Missing
The easiest way to understand the three-layer design is to look at failure cases.
Structured Outputs Without Guardrails
The system becomes neat but naive.
It produces clean objects for requests that still should not proceed.
This is the "perfectly formatted bad idea" failure mode.
Guardrails Without Execution Boundaries
The system notices more risk, but the actual blast radius stays too large.
This is the "we saw the problem, but the system could still do it" failure mode.
Execution Boundaries Without Structured Outputs
The system may be safe enough, but hard to integrate and brittle to operate.
This is the "runtime is protected, but the interface is messy and error-prone" failure mode.
Prompt Instructions Without Any of the Three
This is the weakest setup of all.
It relies on the model to self-police without strong structure, policy checks, or hard runtime limits.
That may be acceptable for low-stakes chat.
It is weak engineering for consequential action.
What Good Control Design Looks Like
Good control design usually has the following properties.
1. Typed Interfaces
Tool inputs, output objects, and state transitions should be structured enough that the rest of the system can reason about them reliably.
2. Policy Checks Near Decision Points
Guardrails should sit near the moments that matter:
- before tool execution
- before external communication
- before sensitive retrieval
- before irreversible actions
3. Least-Privilege Runtime Design
Agents should have the smallest amount of power needed for the task.
That design rule aligns directly with "When to Use a Workflow Instead of an Agent." If the task does not need open-ended runtime choice, a deterministic flow is often easier to bound.
4. Approval Gates for High-Risk Transitions
Humans should be inserted where the downside of being wrong is materially larger than the cost of waiting.
That usually includes:
- money movement
- destructive writes
- external communications
- privileged changes
- regulated decisions
5. Observability Around the Whole Path
The system should log:
- what the model proposed
- what the guardrail flagged
- what the runtime allowed
- what was blocked
- what required approval
Without that, even a well-bounded system becomes hard to debug or improve.
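A minimal structured-logging sketch, with illustrative stage names:

```python
import json
import logging

logger = logging.getLogger("agent.control")


def log_control_event(stage: str, decision: str, detail: dict) -> None:
    """stage is one of: proposed, flagged, allowed, blocked, approval_required."""
    logger.info(json.dumps({"stage": stage, "decision": decision, **detail}))


log_control_event("flagged", "blocked", {"rule": "vendor not on approved list"})
```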
The Practical Rule
If you only remember one rule from this article, use this one:
Structured outputs make actions legible. Guardrails make actions reviewable. Execution boundaries make actions containable.
You want all three.
But if you are forced to rank them by consequence, execution boundaries sit at the top because they are the last authority over side effects.
That is the mature agent-engineering view.
Do not ask the model to be the boundary.
Design the boundary around the model.
The natural next steps after this article are orchestration patterns, approval design, and trajectory-level evaluation. Once control boundaries are clear, you can make better decisions about supervisor and planner-executor structures, and about where a human should remain in the loop.
FAQ
Aren’t structured outputs and guardrails basically the same thing?
No.
Structured outputs constrain the format of the model response.
Guardrails inspect the request, response, or proposed action for policy risk.
One is mainly about machine-readable shape.
The other is mainly about whether the content or action should be allowed.
If the model returns valid JSON, isn’t that good enough?
No.
Valid JSON only tells you the response is parseable.
Even schema-valid output can still contain:
- the wrong recommendation
- an unauthorized action
- a policy violation
- a hallucinated argument
That is why structured outputs improve reliability, but do not by themselves make agent actions safe.
What is the difference between structured outputs and JSON mode?
JSON mode mainly helps ensure the output is valid JSON.
Structured outputs go further by requiring adherence to a defined schema.
That makes downstream handling much more reliable, but it still does not prove that the content is correct or safe.
What is the difference between a guardrail and an execution boundary?
A guardrail checks whether something should proceed.
An execution boundary determines whether it can proceed at all.
For example:
- a guardrail might detect that an email draft contains sensitive data
- an execution boundary might prevent the send action unless a reviewer approves it
Both matter, but the second one is the harder stop.
Are human approvals guardrails or execution boundaries?
They are usually better treated as execution controls or workflow controls.
The important point is not the label.
The important point is that the side effect cannot happen without the approval event.
If the model merely promises to ask first, that is not enough.
Can structured outputs stop prompt injection?
Not on their own.
They can reduce some forms of freeform output chaos, but they do not solve the underlying problem of malicious or conflicting instructions entering the system.
Prompt injection needs broader defenses such as input handling, policy checks, tool restrictions, and bounded runtime behavior.
Do workflows need these controls too, or only agents?
They matter anywhere model output can trigger consequential behavior.
Agents usually need them more urgently because they make more runtime choices, but workflows still benefit from schemas, policy checks, and execution limits when the model influences real actions.
Where should the human stay in the loop?
Keep the human at transitions where:
- the action is irreversible
- the cost of a bad action is high
- the policy is ambiguous
- the available evidence is incomplete
- the organization needs an accountable approval point
That is the natural bridge into the next control-design article on human-in-the-loop systems.
Do guardrails replace least-privilege permissions?
No.
Guardrails and permissions solve different problems.
Guardrails help detect risky or off-policy behavior.
Least-privilege permissions reduce the amount of harm the system can do even when judgment fails.
What comes after this topic in the learning path?
Once you understand control boundaries, the next concepts are:
- supervisor, router, and planner-executor patterns
- human-in-the-loop approval design
- trajectory-level evaluation and observability
Those are the next layers that turn a bounded agent into a production-ready system.