
AI Agent Frameworks

Most framework comparisons are weaker than they look because they compare tools that live at different layers of the stack. The real decision is not just which framework is popular. It is which control surface your team actually needs.

The phrase AI agent frameworks is now doing too much work.

People use it to describe low-level orchestration libraries, higher-level multi-agent toolkits, model-native SDKs, platform-backed development surfaces, and full application builders.

That is why so many framework comparisons feel noisy.

They are often comparing unlike things.

A team asks:

Which framework should we use?

but the more useful question is:

Which layer of control do we actually need?

That is the real category problem.

This article is not a leaderboard.

It is a map.

It explains what an AI agent framework actually is, what does not belong in the same bucket, and how to choose among the current options without letting marketing language do your architecture for you.

This article connects naturally to When to Use a Workflow Instead of an Agent, Supervisor, Router, and Planner-Executor Patterns, What Is Agent Engineering?, How Good Agent Memory Actually Works in Production, and OpenAI Codex as a Coding-Agent Platform. Those pieces explain when autonomy helps, how orchestration patterns differ, what the broader discipline is, how memory surfaces shape architecture, and how one concrete platform should be understood. This one focuses on the category between those ideas: the frameworks and build surfaces teams reach for when they want more structure than raw API calls.

What Counts as an AI Agent Framework

At a useful level, an AI agent framework is a developer layer that gives you reusable structure for building agent systems.

That usually means it helps with some combination of state, tool use, control flow, and orchestration.

The important part is not the branding.

It is the added structure.

A plain model API client is not usually enough on its own.

A full managed platform is often more than a framework.

So the right definition is somewhere in the middle:

an AI agent framework is a reusable software layer for building and operating agent behavior beyond direct model calls

That still leaves a lot of room.

Which is exactly why the category gets messy fast.

Why the Category Is Confusing

Current documentation surfaces are making the confusion worse, not better.

Some tools present themselves as frameworks, others as SDKs, runtimes, or full platforms,

and many of those claims are directionally true.

The problem is that they are not claims about the same layer.

For example, there is a real difference between a low-level orchestration runtime and a high-level multi-agent toolkit, and between a model-native SDK and a full managed platform.

Those are not fake distinctions.

They are real.

But they are also distinctions between different kinds of build surfaces.

So when someone compares LangGraph vs CrewAI vs OpenAI Agents SDK as if they were the same kind of tool, the comparison is already partially wrong.

They overlap.

They do not occupy the same architectural layer.

The S.T.A.C.K. Lens

A better way to compare AI agent frameworks is the S.T.A.C.K. lens: State, Tools, Abstraction, Control, and Kernel.

This is not a feature checklist.

It is a way to ask what kind of system layer a framework is really offering you.

State

How does the framework think about state?

Does it give you lightweight, in-memory state, or persistent, resumable runs?

This matters because state is where a lot of the real platform difference begins.

Some tools are lightweight around state.

Others are built around persistent, resumable runs.

That is not a cosmetic difference.

It changes what kinds of systems they fit well.
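To make the persistent-versus-lightweight distinction concrete, here is a minimal sketch of a resumable run. It is not tied to any framework; the checkpoint file name and the step shape are assumptions for illustration only.

```python
import json
import os

CHECKPOINT = "run_state.json"  # hypothetical checkpoint location

def load_state():
    """Resume a prior run if a checkpoint exists, else start fresh."""
    if os.path.exists(CHECKPOINT):
        with open(CHECKPOINT) as f:
            return json.load(f)
    return {"step": 0, "history": []}

def save_state(state):
    """Persist state after every step so the run is resumable."""
    with open(CHECKPOINT, "w") as f:
        json.dump(state, f)

def run(steps):
    state = load_state()
    while state["step"] < len(steps):
        result = steps[state["step"]](state)  # each step is a plain function
        state["history"].append(result)
        state["step"] += 1
        save_state(state)  # if the process dies here, the next run resumes
    return state
```

A framework that is "built around persistent, resumable runs" is essentially making this pattern first-class: the checkpoint, the step index, and the resume path stop being your code and become the runtime's job.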

Tools

What tool model does the framework assume?

Does it mainly expose function-style tools, bounded agent turns, or a richer workflow vocabulary?

Different frameworks are opinionated here.

Some treat tools as the center of the loop.

Some treat them as one component inside a broader graph or workflow model.

If your system is mostly about tool execution and bounded agent turns, one kind of framework fits better.

If your system is really about a larger event-driven workflow, another category fits better.
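A minimal sketch of the tool-centric model, with hypothetical names throughout: a tool registry, a stand-in for the model (`fake_model`), and a loop with an explicit turn budget.

```python
# Hypothetical tool registry: plain functions the loop can call by name.
TOOLS = {
    "add": lambda a, b: a + b,
    "upper": lambda s: s.upper(),
}

def fake_model(observation):
    """Stand-in for a real model call: picks the next tool or finishes."""
    if observation is None:
        return {"tool": "add", "args": (2, 3)}
    return {"done": True, "answer": observation}

def run_agent(max_turns=5):
    observation = None
    for _ in range(max_turns):  # bounded turns: the loop cannot run away
        decision = fake_model(observation)
        if decision.get("done"):
            return decision["answer"]
        tool = TOOLS[decision["tool"]]  # look up the requested tool
        observation = tool(*decision["args"])
    raise RuntimeError("turn budget exhausted")
```

Frameworks that treat tools as the center of the loop are essentially shipping a hardened version of this loop; frameworks built around a graph or workflow model treat the loop itself as one node among many.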

Abstraction

How much architecture is already decided for you?

This is where framework selection gets real.

Some tools give you prebuilt roles, agent teams, and multi-agent patterns out of the box.

Others stay deliberately low-level and expect you to compose the system yourself.

Neither is automatically better.

High abstraction can speed up early work.

Low abstraction can preserve control when the system becomes weird, expensive, or reliability-sensitive.

Control

Where does explicit control live?

This is one of the most important questions.

Does the framework make it easy to express exact execution semantics, meaning which step runs, in what order, and under which conditions,

or does it encourage looser autonomous behavior and hide the control plane behind easier abstractions?

If your team needs exact execution semantics, the answer matters more than almost any benchmark table.
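One way to picture explicit control is routing and gating written as plain code. Everything here (`classify`, `handle`, the `approve` callback) is a hypothetical sketch, not any framework's API.

```python
def classify(request):
    """Explicit routing decision: readable, testable, deterministic."""
    return "write" if request.get("mutates") else "read"

def handle(request, approve):
    route = classify(request)
    if route == "read":
        return {"status": "ok", "action": "read"}
    if not approve(request):  # explicit gate before any side effect runs
        return {"status": "rejected", "action": "write"}
    return {"status": "ok", "action": "write"}
```

When the control plane is this visible, exact execution semantics are trivial to audit. When a framework hides it behind an autonomous loop, you inherit whatever semantics the loop has.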

Kernel

What is the real runtime kernel of the framework?

In other words:

what is the deepest thing it is actually built around?

That kernel might be a graph runtime, an autonomous agent loop, an event-driven workflow engine, or a model-native run surface.

This is the layer many comparisons miss.

Frameworks that look similar on the surface can feel very different because their kernel is different.

That is usually the real reason one tool fits and another does not.
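A toy illustration of the kernel idea, using hypothetical classes: two runners share the same `run(task)` surface, but one is built around a model-driven loop and the other around a declared graph.

```python
class LoopKernel:
    """Kernel: a model-driven loop. The policy decides the next action."""
    def __init__(self, policy):
        self.policy = policy  # callable: state -> next action, or None to stop

    def run(self, task):
        state = [task]
        while (action := self.policy(state)) is not None:
            state.append(action(state))
        return state[-1]

class GraphKernel:
    """Kernel: a fixed graph. Code decides the path; steps fill in the work."""
    def __init__(self, nodes, edges):
        self.nodes, self.edges = nodes, edges

    def run(self, task, start="start"):
        state, node = task, start
        while node is not None:
            state = self.nodes[node](state)
            node = self.edges.get(node)  # next node is declared, not decided
        return state
```

Both expose `run(task)`, and a quickstart for either would look nearly identical. But the first hands control to the policy at every step, while the second keeps control in the declared edges. That is the kind of kernel difference surface-level comparisons miss.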

The Main Categories That Matter Right Now

The market is crowded, but the useful categories are not endless.

They cluster into a few main groups.

1. Low-Level Orchestration Frameworks

This is the category LangGraph represents most clearly.

Its own docs position it as a low-level orchestration framework and runtime for long-running, stateful agents, with emphasis on durable execution, persistence, and human-in-the-loop control.

That is not the same thing as a high-level multi-agent toolkit.

It is closer to an orchestration substrate.

This category fits teams that want explicit state, explicit control flow, and long-running, resumable execution.

The tradeoff is predictable:

you get more control, but you also own more design work.

2. Higher-Level Multi-Agent Abstraction Frameworks

CrewAI is the clearest example here.

Its current docs are built around the split between Crews, which coordinate autonomous teams of agents, and Flows, which provide event-driven, fine-grained workflow control.

That is already a strong category signal.

CrewAI is not only giving you primitives.

It is giving you a more opinionated model for how autonomous teams of agents fit inside a larger workflow.

That is useful for teams that want multi-agent structure quickly, without designing every primitive themselves.

The tradeoff is also clear:

the more abstraction you inherit, the more carefully you need to inspect whether that abstraction still fits once the system gets more complex.

3. Model-Native Agent SDKs and Runtimes

The OpenAI Agents SDK sits here most clearly.

Its docs are unusually explicit about the category boundary: it is presented as a lightweight, model-native SDK with a small set of primitives, not a general-purpose orchestration framework.

That is an important distinction.

This is less of a generic framework for every architecture and more of a model-native runtime/SDK layer for agent systems.

It fits teams that want tight alignment with the model provider's runtime, built-in tool calling, and simple agent primitives,

without necessarily adopting a graph-first or crew-first abstraction.

This category is often misunderstood because people compare it directly to orchestration frameworks as if they were solving the exact same layer.

They are not.

4. Platform-Backed Agent Frameworks

Google ADK and Microsoft Agent Framework belong more here.

They are frameworks.

But they are also clearly tied to broader platform stories.

Google ADK presents itself as a framework that can start simply, then expand into multi-agent systems, evaluation, and managed deployment on Google's cloud stack.

Microsoft Agent Framework is similarly explicit that it combines agent orchestration and workflow support in one surface,

and positions itself as the successor to AutoGen and Semantic Kernel.

These are not just lightweight libraries.

They are broader ecosystem-backed development surfaces.

That can be a strength if your team wants ecosystem integration, managed deployment paths, and enterprise support.

It can also be a constraint if you wanted a thinner, less ecosystem-shaped layer.

5. Framework-Adjacent Application and Agent Builders

This is where tools like Mastra, Pydantic AI, and LlamaIndex become interesting.

They are real framework surfaces.

But they also carry stronger identity around a specific developer motion.

Mastra frames itself as a modern TypeScript framework and platform for AI-powered applications and agents, with strong emphasis on workflows, agents, and an integrated application-development experience.

Pydantic AI frames itself as a Python agent framework focused on type-safe, production-grade agent development with broad model support and tight observability integration.

LlamaIndex still reads most clearly as a framework for building agentic systems over your data, with workflows and context augmentation as first-class concerns.

These are not all the same kind of product.

But they share a trait:

they are framework surfaces shaped around a particular developer center of gravity: TypeScript application development, type-safe Python, or agentic systems over your own data.

That matters because sometimes the best framework choice is not about a general category at all.

It is about where your team already lives.

What Most Teams Get Wrong

The most common mistake is to ask:

Which framework is winning?

That is usually the wrong first question.

The better first questions are the S.T.A.C.K. ones: what state model do we need, what tool model, how much abstraction can we safely inherit, where must explicit control live, and what kernel are we really betting on?

A lot of framework pain comes from abstraction mismatch.

Teams choose a high-abstraction tool for a control-heavy problem, or a low-level substrate for a problem where speed mattered more.

That is why the category feels harder than it should.

People are often choosing a story, not a layer.

How to Choose Without Fooling Yourself

A simple selection rule is:

Start Lower When Reliability and Control Matter Most

If your system is long-running, stateful, reliability-sensitive, or expensive when it fails,

then lower-level orchestration and explicit workflow control usually age better.

You will do more upfront work.

You will also understand your system better.

Start Higher When Speed and Team Throughput Matter Most

If your team wants fast iteration, quick first versions, and shared structure across many builders,

then higher-level abstraction can be worth it.

Just do not confuse speed of first demo with long-term control quality.

Prefer Platform Alignment Only When It Actually Helps

A platform-backed framework can help if you are already committed to that ecosystem and its deployment, identity, and observability story.

But platform alignment is not automatically architectural clarity.

Sometimes it is just gravity.

Avoid the Framework If Plain Workflows Are Enough

This point matters more than most framework vendors would like.

If the job is mostly a fixed sequence of steps with known inputs and outputs,

then When to Use a Workflow Instead of an Agent still applies.

You do not get engineering points for introducing a framework you do not need.
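As a contrast, here is what "plain workflows are enough" looks like in practice. `fetch`, `summarize`, and `store` are hypothetical stand-ins for real steps.

```python
def fetch(ticket_id):
    """Stand-in for loading a record from a real data source."""
    return {"id": ticket_id, "text": "printer on fire"}

def summarize(ticket):
    """Stand-in for what would be a single model call in a real system."""
    return ticket["text"][:20]

def store(ticket_id, summary):
    """Stand-in for persisting the result."""
    return {"id": ticket_id, "summary": summary}

def workflow(ticket_id):
    """The whole control surface is visible in three lines."""
    ticket = fetch(ticket_id)
    summary = summarize(ticket)
    return store(ticket_id, summary)
```

If this shape covers the job, a framework adds surface area without adding control.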

My View

AI agent frameworks is a real category.

It is also a messy one.

The way to make it cleaner is not to hunt for the one best framework.

It is to stop pretending every framework is solving the same problem.

Some are orchestration substrates. Some are multi-agent abstraction layers. Some are model-native SDKs. Some are platform-backed development surfaces. Some are application builders with a framework inside.

That is a healthier map.

And once you use that map, a lot of the noise disappears.

The right framework choice is usually not about popularity.

It is about what layer of control your team actually needs.

FAQ

Do most teams need an AI agent framework?

No.

Some teams need one.

Many teams first need a smaller amount of explicit workflow code, better tool design, and clearer control boundaries before a framework adds real value.

What is the difference between a framework and an SDK here?

An SDK usually gives you programmatic access to a model-native or product-native runtime surface.

A framework usually gives you broader reusable structure for building agent behavior, state, control flow, or orchestration.

The problem is that many current tools blur the line.

Which framework is best?

There is no stable single answer.

The better question is:

which control surface fits the system you are actually building?

That is why low-level orchestration tools, higher-level multi-agent frameworks, model-native SDKs, and platform-backed frameworks should not all be compared as if they were interchangeable.

Do I need a framework before I can build a real agent system?

No.

A lot of real systems start with plain workflow code, a few well-designed tools, and explicit control boundaries,

and only adopt a framework when the control surface gets too repetitive to manage cleanly by hand.

How should I think about frameworks versus workflows?

A workflow is usually an execution shape.

A framework is a reusable software layer.

Some frameworks are built around workflows.

Some are built around agent loops.

Some combine both.

That is one reason the category is so easy to flatten by mistake.

The better question is:

which framework category fits your needed state model, tool model, abstraction level, and control surface?