Supervisor, Router, and Planner-Executor Patterns

Once you move beyond one generic agent loop, the next design question is not "Which framework should I use?"

It is:

Where should control live?

That is the real question behind supervisor, router, and planner-executor patterns.

These are not interchangeable multi-agent buzzwords.

They solve different orchestration problems.

The short answer is this:

a router decides where the task should go
a planner decides how the task should be broken down
an executor carries out that plan
a supervisor stays in charge while the work is happening

That is the simplest useful distinction.

This article builds on "Planning and Task Decomposition", "ReAct and the Basic Reasoning Loop", "Tool Use: How Agents Take Action", "When to Use a Workflow Instead of an Agent", and "Structured Outputs, Guardrails, and Execution Boundaries". Those pieces explain planning, local reasoning loops, tool execution, bounded autonomy, and runtime control. This one explains how those components get arranged once one undifferentiated loop is no longer enough.

Why Orchestration Patterns Start to Matter

For simple tasks, one agent may be enough.

If the job is:

answer a question
summarize a document
call one tool
perform one obvious workflow

then extra orchestration often just adds cost and confusion.

But once the work starts to involve:

multiple specialties
different tool domains
ambiguous task routing
multi-step planning
parallel investigation
synthesis across different outputs

the main question becomes architectural.

Who decides what happens next?

Is the task handed off once and then owned by something else?

Is the plan made upfront and then executed mechanically?

Or is there a central controller that keeps deciding throughout the run?

That is where these patterns become useful.

They are not just implementation styles.

They are different answers to the question of where authority lives across the task lifecycle.

Router Pattern: Choose the Lane

A router pattern is the lightest of the three.

Its job is to decide where the work should go.

That usually means:

choosing the right specialist agent
choosing the right workflow
choosing the right tool family
choosing the right model tier
choosing the right processing path

The key point is that the router normally does not stay deeply involved after the handoff.

It dispatches.

Then it steps back.

That is why a router is best understood as a traffic controller, not a project manager.

What a Router Is Good At

Routers work best when the main challenge is classification.

For example:

billing question vs technical-support question
legal-doc extraction vs email drafting
simple support case vs fraud-review case
small model vs large model

In those cases, the real problem is:

Which lane should handle this request?

That is a routing problem.

Router vs Supervisor

This is the distinction many readers need most.

A router says:

This belongs over there.

A supervisor says:

I am still responsible for this task, and I will decide who does each part, when, and how the outputs come back together.

That is a much bigger job.

When Not to Use a Router

Do not use a router when the task needs ongoing cross-specialist coordination.

If the system must:

ask multiple specialists for partial work
compare their outputs
resolve conflicts
decide whether another step is needed
synthesize the final answer

then the problem is no longer just dispatch.

At that point, a router alone is too thin.

Also note that not every router needs an LLM.

If the lanes are stable and the criteria are obvious, rules are often better.

That is the same lesson as in "When to Use a Workflow Instead of an Agent": do not spend dynamic reasoning on a choice that can be encoded simply.
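A rule-based router of that kind can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the lane names and keyword lists are hypothetical, the first matching rule wins, and anything unmatched falls to a default lane.

```python
# Hypothetical lanes and keywords, chosen only for illustration.
RULES = [
    ("billing", ("invoice", "charged", "refund")),
    ("shipping", ("delivery", "tracking", "order")),
    ("technical_support", ("error", "crash", "login")),
]
DEFAULT_LANE = "general_support"


def route(request: str) -> str:
    """Return the first lane whose keywords appear in the request."""
    text = request.lower()
    for lane, keywords in RULES:
        if any(keyword in text for keyword in keywords):
            return lane
    return DEFAULT_LANE  # nothing matched: fall back rather than guess


print(route("I was charged twice this month"))   # billing
print(route("The app shows an error on login"))  # technical_support
print(route("Do you have a student discount?"))  # general_support
```

Substring matching is deliberately crude here. When rules like these stop being obvious or start to conflict, that is usually the signal to consider a model-based classifier for the routing step.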

Planner-Executor Pattern: Separate Strategy From Action

The planner-executor pattern splits the work into two roles.

One component creates the roadmap.

Another component executes the roadmap.

That is the core idea.

The planner answers:

what substeps are needed?
what order should they happen in?
what dependencies exist?
what should be verified before moving on?

The executor answers:

can this step be completed now?
what tool call or action is required?
what was the result of that step?

So planner-executor is not mainly about dispatching between specialists.

It is about separating strategic decomposition from tactical action.
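One way to picture the split is a plan expressed as steps with dependencies, plus an executor that simply walks them in a valid order. The sketch below assumes a hypothetical `Step` type and a stubbed `run_step` callable; it uses the standard-library `graphlib.TopologicalSorter` for ordering and does not model verification checks.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter


@dataclass
class Step:
    """Hypothetical plan step: a name plus the steps it depends on."""
    name: str
    depends_on: list = field(default_factory=list)


def execution_order(plan: list) -> list:
    """Order step names so every step runs after its dependencies."""
    graph = {step.name: set(step.depends_on) for step in plan}
    return list(TopologicalSorter(graph).static_order())


def execute(plan: list, run_step) -> dict:
    """Executor: run each step in dependency order and record its result."""
    results = {}
    for name in execution_order(plan):
        results[name] = run_step(name, results)
    return results


# A plan a planner might emit for a small research task (illustrative only).
plan = [
    Step("write_report", depends_on=["analyze"]),
    Step("analyze", depends_on=["collect_data"]),
    Step("collect_data"),
]
results = execute(plan, lambda name, done: f"{name} done")
print(list(results))  # ['collect_data', 'analyze', 'write_report']
```

The planner's output is just data. The executor makes no strategic choices of its own, which is what makes progress easy to track.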

Why Teams Use It

This pattern is attractive because it creates more structure than a raw ReAct loop.

Instead of rethinking the whole task after every small step, the system can build an initial roadmap and then work through it.

That often helps with:

long tasks
large objectives
repeatable investigation flows
cost control
more legible progress tracking

The system does more of its thinking upfront.

That can make execution more predictable.

Planner-Executor vs ReAct

This distinction needs to be explicit.

ReAct is a local reasoning loop:

think
act
observe
think again

Planner-executor is a higher-level orchestration split.

It decides that planning and doing should be handled as separate roles.

So the difference is not:

ReAct = one agent
planner-executor = many agents

That is too simplistic.

A planner-executor system might use:

one planner and one executor
one planner and several executors
one planner with deterministic execution steps

The real distinction is structural.

ReAct keeps reasoning tightly interleaved with execution.

Planner-executor moves more of the reasoning into an earlier planning phase.
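The structural difference shows up in where the reasoning call sits. In the sketch below, `think`, `plan`, and `act` are stub functions standing in for model and tool calls; they are assumptions for illustration, not any framework's API.

```python
from typing import Callable, Optional

Think = Callable[[list], Optional[str]]  # sees observations so far, picks the next action
Act = Callable[[str], str]               # performs one action, returns an observation


def react_loop(think: Think, act: Act, max_steps: int = 10) -> list:
    """Reasoning is interleaved: think() runs before every action."""
    observations = []
    for _ in range(max_steps):
        action = think(observations)
        if action is None:  # the reasoning step decides the task is done
            break
        observations.append(act(action))
    return observations


def plan_then_execute(plan: Callable[[], list], act: Act) -> list:
    """Reasoning is front-loaded: plan() runs once, then steps just execute."""
    return [act(action) for action in plan()]


# Stubs standing in for model and tool calls.
def act(action: str) -> str:
    return f"result of {action}"


def think(observations: list) -> Optional[str]:
    script = ["search", "read", "summarize"]
    return script[len(observations)] if len(observations) < len(script) else None


def plan() -> list:
    return ["search", "read", "summarize"]


print(react_loop(think, act) == plan_then_execute(plan, act))  # True
```

Both runs produce the same observations here, but `think` is called four times (three actions plus the stop decision) while `plan` is called once. The loop can change course after any observation, and the plan cannot unless replanning is added.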

Main Tradeoff

Planner-executor is often stronger than a pure loop when the task can be decomposed well.

But it can become brittle if the world changes in the middle of execution.

If the plan is wrong, incomplete, or overtaken by new evidence, the system may need replanning.

That is why this pattern works best when the task is complex enough to need a plan, but the environment is stable enough that the plan does not become obsolete immediately.

Supervisor Pattern: Retain Control Across the Run

A supervisor pattern is the heaviest of the three.

The supervisor stays in charge while the work is happening.

It does not just hand the task off once.

It manages the lifecycle.

That usually means the supervisor can:

break the task into parts
assign work to specialists
inspect intermediate outputs
decide whether to continue, retry, escalate, or stop
synthesize the final answer

So the key difference is authority retention.

The router transfers authority.

The planner embeds authority into the plan.

The supervisor keeps authority centrally and uses it dynamically.

That is why the supervisor pattern looks most like a manager.
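A supervisor loop can be sketched as "decide, delegate, inspect, repeat" with a round budget. Everything named here (`supervise`, `decide`, the decision dictionary shape, the worker names) is a hypothetical illustration, and the `decide` stub stands in for what would normally be a model call.

```python
from typing import Callable


def supervise(decide: Callable[[list], dict], workers: dict, max_rounds: int = 6) -> dict:
    """Run a supervisor loop: decide, delegate, inspect, repeat."""
    history = []  # (specialist, task, output) triples seen so far
    for _ in range(max_rounds):
        decision = decide(history)  # authority stays here every round
        if decision["action"] in ("finish", "escalate"):
            return {"status": decision["action"], "detail": decision["detail"],
                    "rounds": len(history)}
        specialist, task = decision["specialist"], decision["task"]
        history.append((specialist, task, workers[specialist](task)))
    # Budget exhausted without a final decision: hand off rather than loop forever.
    return {"status": "escalate", "detail": "round budget exhausted", "rounds": len(history)}


# Stubs standing in for model-backed specialists.
workers = {
    "researcher": lambda task: f"notes on {task}",
    "writer": lambda task: f"draft covering {task}",
}


def decide(history: list) -> dict:
    if not history:
        return {"action": "delegate", "specialist": "researcher", "task": "pricing"}
    last_specialist, _, last_output = history[-1]
    if last_specialist == "researcher":
        # The next assignment depends on what the researcher returned.
        return {"action": "delegate", "specialist": "writer", "task": last_output}
    return {"action": "finish", "detail": last_output}


print(supervise(decide, workers))
```

A retry in this shape is just `decide` delegating the same task again after inspecting a poor output. The round budget is one simple way to keep that from running unbounded.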

Supervisor vs Orchestrator-Worker

Many framework docs describe something like orchestrator-worker.

In practice, that usually lands closest to the supervisor family.

Why?

Because the orchestrator is not merely classifying and dispatching once.

It is typically:

assigning work
reviewing intermediate outputs
deciding whether more work is needed
combining results into the final answer

That is supervisor behavior.

The exact labels vary by framework.

The underlying control shape is what matters.

When a Supervisor Makes Sense

Use a supervisor when the work needs:

ongoing coordination
multiple specialists with different capabilities
conflict resolution between outputs
adaptive delegation
synthesis across several partial results

This is most useful on ambiguous tasks where the right sequence is not obvious upfront and one simple handoff is not enough.

What It Costs

Supervisor systems are powerful.

They are also expensive.

They usually mean:

more model calls
more latency
more context passing
more tracing complexity
more failure modes in coordination

Despite that cost, many teams overuse them.

A supervisor feels architecturally advanced.

But if the task only needed routing or a bounded planner-executor split, the supervisor is often overhead rather than progress.

The A.L.M. Lens

The cleanest way to distinguish these patterns is to use one comparison lens.

Call it The A.L.M. Lens.

A.L.M. stands for:

Authority
Lifecycle
Management

The point of the lens is simple:

Ask when decisions are made and who still owns the task after each decision.

Router

Decision timing: immediate
Authority: transferred
Lifecycle ownership: minimal after handoff

The router chooses the lane and exits.

Planner-Executor

Decision timing: front-loaded
Authority: embedded in the plan
Lifecycle ownership: split between planning and execution

The planner decides the roadmap first.

The executor carries it out.

Supervisor

Decision timing: continuous
Authority: retained centrally
Lifecycle ownership: active throughout the run

The supervisor keeps deciding as the task unfolds.

That is why this lens is useful.

It does not depend on LangGraph terminology, AutoGen terminology, CrewAI terminology, or any vendor taxonomy.

It explains the patterns by control shape.

That is the durable distinction.

Quick Comparison

Pattern	When decisions happen	Who owns the task after the first decision?	Best for	Main failure mode
Router	immediately, at intake	the destination workflow or specialist	classification and dispatch	pretending dispatch is enough when the task really needs coordination
Planner-executor	mostly upfront, with occasional replanning	the plan during execution	decomposable work with a legible roadmap	brittle execution when new evidence invalidates the plan
Supervisor	continuously through the run	the supervisor	ambiguous, cross-specialist, adaptive work	coordination overhead, latency, and cost

How These Patterns Map to Real Systems

The labels vary across frameworks.

The control shape is more stable than the product vocabulary.

That is why it helps to map the pattern names back to the common public implementations.

Anthropic’s public guidance leans on simple composable patterns such as routing, prompt chaining, and orchestrator-worker structures. That maps closely to the distinction in this article between dispatch, staged planning, and retained supervision.
OpenAI’s agent-building guidance separates single-agent loops from multi-agent orchestration and includes handoffs and manager-style coordination. In practice, that means routers and supervisors are different jobs, even when both are implemented with agents.
Microsoft’s architecture guide breaks multi-agent systems into orchestration patterns such as handoff, sequential, concurrent, and magentic orchestration. Those patterns overlap with the same design question here: who keeps authority, and for how long?
LangChain and LangGraph are especially explicit about the difference between a router and a supervisor. Their current docs describe routers as lightweight dispatch and supervisor-style subagents as centralized, conversation-aware control.

So if you see terms like:

handoff
manager pattern
orchestrator-worker
subagents
routing
plan-and-execute

do not ask whether the names match perfectly.

Ask which control shape they represent.

That question is more transferable than any one framework taxonomy.

One Example Across All Three: Support Operations

Holding one example constant makes the differences easier to see.

Suppose a company is building an AI support system for incoming customer issues.

The system receives this request:

My laptop was charged twice, the replacement order is missing, and support already promised a refund.

Now compare the three patterns.

Router Version

A router classifies the issue and sends it to:

billing support
shipping support
account support

Whichever lane it chooses becomes the new owner.

This is strong when the main goal is triage.

It is weak when the issue clearly spans several domains.

Planner-Executor Version

A planner breaks the task into a roadmap such as:

verify the duplicate charge
check replacement-order status
inspect prior support history
decide whether refund conditions are already met
execute the approved remediation steps

An executor or set of executors then performs those steps in sequence.

This is strong when the workflow can be structured well enough upfront.

It is weaker when new evidence changes the sequence materially halfway through.
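That weakness can be softened with a bounded replanning hook. The sketch below reuses the roadmap above as step names; `run_step`, `replan`, the `invalidates_plan` flag, and the revised steps are all hypothetical stand-ins for real tool calls and a real planner.

```python
ROADMAP = [
    "verify_duplicate_charge",
    "check_replacement_order_status",
    "inspect_support_history",
    "decide_refund_eligibility",
    "execute_remediation",
]


def run_with_replanning(plan, run_step, replan, max_replans=1):
    """Execute steps in order; if a step invalidates the plan, replan the rest."""
    completed, remaining, replans = [], list(plan), 0
    while remaining:
        step = remaining.pop(0)
        result = run_step(step)
        completed.append(step)
        if result.get("invalidates_plan"):
            if replans >= max_replans:
                raise RuntimeError(f"plan invalidated again at {step!r}; escalate")
            replans += 1
            remaining = replan(completed, result)
    return completed


# Stubbed evidence: the replacement order turns out never to have been created.
def run_step(step):
    if step == "check_replacement_order_status":
        return {"invalidates_plan": True, "finding": "no replacement order exists"}
    return {"invalidates_plan": False}


def replan(completed, result):
    # Hypothetical revised tail once the missing order is discovered.
    return [
        "inspect_support_history",
        "create_replacement_order",
        "decide_refund_eligibility",
        "execute_remediation",
    ]


for step in run_with_replanning(ROADMAP, run_step, replan):
    print(step)
```

Capping the number of replans keeps the pattern honest: if the plan keeps being invalidated, the task probably needs a supervisor or a human rather than another roadmap.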

Supervisor Version

A supervisor keeps ownership and dynamically delegates:

ask billing specialist to confirm the duplicate charge
ask shipping specialist to inspect the replacement order
ask policy specialist to verify refund authority
compare the results
decide whether to refund, reship, escalate, or ask for more evidence

This is strong because the task is cross-functional and ambiguous.

It is expensive because the coordination itself becomes a major part of the run.
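The "compare the results and decide" step is where the supervisor earns its cost. As a rough sketch, assume each specialist returns `True`, `False`, or `None` (unknown) for its question. The specialist stubs, the finding keys, and the decision policy below are invented for illustration and are not a real refund policy.

```python
def decide_remediation(findings: dict) -> list:
    """Combine specialist findings into next actions (illustrative policy only)."""
    if any(value is None for value in findings.values()):
        return ["ask_for_more_evidence"]  # at least one specialist could not answer
    actions = []
    if findings["duplicate_charge"]:
        # Refund only when policy allows it; otherwise hand the case to a human.
        actions.append("refund" if findings["refund_authorized"] else "escalate")
    if findings["replacement_missing"]:
        actions.append("reship")
    return actions or ["no_action"]


# Stub specialists; real ones would call billing, shipping, and policy systems.
specialists = {
    "duplicate_charge": lambda case: True,      # billing specialist
    "replacement_missing": lambda case: True,   # shipping specialist
    "refund_authorized": lambda case: False,    # policy specialist
}

case = {"customer_id": "example-123"}  # hypothetical case record
findings = {topic: ask(case) for topic, ask in specialists.items()}
print(decide_remediation(findings))  # ['escalate', 'reship']
```

No single lane could have produced that outcome on its own, because the refund decision depends on both the billing finding and the policy finding.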

Hybrid Version

In real systems, the best design often combines patterns.

For example:

a router triages simple requests
complex multi-domain cases go to a supervisor
the supervisor uses a planner-executor split for one bounded investigation workflow

That is normal.

These patterns are composable.

The mistake is to treat them as mutually exclusive product categories instead of reusable control shapes.
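The first two layers of that hybrid can be sketched as a triage function: single-lane requests are routed, multi-lane requests go to a supervisor. The keyword lists, lane names, and the `human_agent` fallback are assumptions made for the example; the nested planner-executor workflow is left out.

```python
# Hypothetical keyword lists per lane, for illustration only.
LANE_KEYWORDS = {
    "billing": ("charged", "refund", "invoice"),
    "shipping": ("order", "delivery", "missing"),
    "account": ("password", "login", "profile"),
}


def matching_lanes(request: str) -> list:
    """Return every lane whose keywords appear in the request."""
    text = request.lower()
    return [lane for lane, keywords in LANE_KEYWORDS.items()
            if any(keyword in text for keyword in keywords)]


def triage(request: str) -> dict:
    lanes = matching_lanes(request)
    if len(lanes) == 1:
        return {"owner": lanes[0], "mode": "routed"}  # simple case: hand off once
    if len(lanes) > 1:
        # The request spans domains, so one owner keeps coordinating.
        return {"owner": "supervisor", "mode": "supervised", "specialists": lanes}
    return {"owner": "human_agent", "mode": "fallback"}


request = ("My laptop was charged twice, the replacement order is missing, "
           "and support already promised a refund.")
print(triage(request))
print(triage("I forgot my password"))
```

The example request touches both billing and shipping, so it is handed to the supervisor with those two specialists, while the password request is routed straight to the account lane.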

Which Pattern Should You Start With?

Start with the smallest pattern that matches the real control problem.

That is the practical rule.

Start With a Router When

the main difficulty is classification
the lanes are distinct
cross-specialist synthesis is rare
the work should move to one owner quickly

Start With Planner-Executor When

the task needs decomposition
the roadmap can be made useful upfront
execution can follow that plan with limited adaptation
you want more structure than a pure ReAct loop

Start With a Supervisor When

authority needs to stay central
several specialists must contribute
outputs may conflict
the next move depends on intermediate results across the task

Do Not Start With a Supervisor By Default

This deserves a direct warning.

Many teams hear "multi-agent" and jump straight to a supervisor architecture.

That is often premature.

If the system can succeed with:

simple routing
deterministic orchestration
one planner and one executor

then start there.

That is easier to:

trace
evaluate
govern
debug
bound

And it aligns with the broader site principle that autonomy should be earned, not assumed.

The Practical Rule

If you only remember one distinction, use this one:

Routers dispatch. Planners structure. Supervisors stay responsible.

That is the cleanest first-principles summary.

And if you need the design rule behind it, use this:

Choose the pattern based on where authority should live after the first decision.

If authority should transfer, route.

If authority should be embedded into an upfront roadmap, plan and execute.

If authority must stay active and central throughout the run, supervise.

That is the real architecture choice.
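The rule can be written down as a small decision function. This is a simplification under stated assumptions: the three boolean inputs are hypothetical summaries of a task, and real systems will have cases that do not reduce to three flags.

```python
from enum import Enum


class Authority(Enum):
    TRANSFERRED = "transferred"    # hand off once; the new owner takes over
    EMBEDDED_IN_PLAN = "embedded"  # decided upfront and carried by the roadmap
    RETAINED = "retained"          # stays central for the whole run


PATTERN_FOR = {
    Authority.TRANSFERRED: "router",
    Authority.EMBEDDED_IN_PLAN: "planner-executor",
    Authority.RETAINED: "supervisor",
}


def choose_pattern(needs_decomposition: bool,
                   roadmap_stable_upfront: bool,
                   needs_ongoing_coordination: bool) -> str:
    """Pick the smallest pattern that matches where authority should live."""
    if needs_ongoing_coordination or (needs_decomposition and not roadmap_stable_upfront):
        authority = Authority.RETAINED
    elif needs_decomposition:
        authority = Authority.EMBEDDED_IN_PLAN
    else:
        authority = Authority.TRANSFERRED
    return PATTERN_FOR[authority]


print(choose_pattern(False, False, False))  # router
print(choose_pattern(True, True, False))    # planner-executor
print(choose_pattern(True, False, False))   # supervisor
```

The branch order encodes the "start small" advice: a supervisor is chosen only when coordination is required or no stable roadmap is available.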

The next topics naturally move from here into human-in-the-loop control design and then into evaluation and observability, because once orchestration patterns are clear, the next question is how to control and measure them.

FAQ

Isn’t a router just a simple supervisor?

No.

A router mainly decides where the task should go.

A supervisor keeps ownership of the task and continues coordinating while the work is happening.

That ongoing authority is the main difference.

Is planner-executor the same thing as multi-agent?

No.

Planner-executor is a role split, not a synonym for multi-agent.

You can have:

one planner and one executor
one planner and many executors
a planner-executor system inside a broader supervised architecture

The core idea is the separation between roadmap creation and execution.

When should I use a router instead of a supervisor?

Use a router when the main problem is dispatch:

Which lane?
Which specialist?
Which workflow?

Use a supervisor when the main problem is coordination:

Which subtask next?
Which specialist now?
Are the current results enough?
How do these partial outputs fit together?

How is planner-executor different from ReAct?

ReAct is a local loop where reasoning and action stay tightly interleaved.

Planner-executor moves more of the reasoning into an earlier planning phase and then executes against that roadmap.

So ReAct is an execution pattern.

Planner-executor is an orchestration pattern.

Can one system use all three patterns?

Yes.

That is common in production.

A system might:

route simple requests
supervise only the complex ones
use planner-executor inside one specialist workflow

These are composable patterns.

Which pattern is cheapest?

Usually:

router is cheapest
planner-executor is next
supervisor is most expensive

That is not a law, but it is the usual ordering, because continuous coordination costs more than one dispatch decision or one upfront plan.

Where should the human sit in these patterns?

At high-risk transitions.

That includes:

approvals
irreversible actions
ambiguous policy decisions
escalations
conflict resolution when the system is unsure

The orchestration pattern does not remove the need for human control. It changes where that control should be inserted.
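Whatever pattern is in use, a human checkpoint can be sketched as a gate in front of high-risk actions. The action names, the `approve` callback, and the status strings below are hypothetical; a production gate would typically pause and resume asynchronously rather than call a function inline.

```python
# Hypothetical set of actions that must not run without human approval.
HIGH_RISK_ACTIONS = {"issue_refund", "delete_account", "escalate_to_legal"}


def execute_with_gate(action: str, perform, approve) -> dict:
    """Run low-risk actions directly; hold high-risk ones until a human approves."""
    if action in HIGH_RISK_ACTIONS and not approve(action):
        return {"action": action, "status": "blocked_pending_review"}
    return {"action": action, "status": "done", "result": perform(action)}


def perform(action: str) -> str:
    return f"{action} completed"  # stub for the real side effect


def deny_all(action: str) -> bool:
    return False  # stub reviewer that approves nothing


print(execute_with_gate("send_status_email", perform, deny_all))
print(execute_with_gate("issue_refund", perform, deny_all))
```

In a router the gate usually sits inside the destination lane, in a planner-executor it sits on specific plan steps, and in a supervisor it sits on the supervisor's own decisions.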

What comes after orchestration patterns in the learning path?

The next clean topics are:

human-in-the-loop control design
evaluating agent trajectories, not just outputs
tracing and observability for orchestrated systems

That is where orchestration moves from architecture into production discipline.