The Orchestrator Developer #2 — Why a Single Agent Is Not Enough

In the first article of this series, I explained why agentic AI truly changes web development. Not just because it accelerates certain tasks, but because it transforms the very organization of work. From the moment AI enters the workflow, reads files, modifies code, runs commands, prepares deliverables, and participates in execution chains, we’re no longer talking about simple assistance. We’re talking about a production system to be piloted.

From there, an idea comes up very often. It’s seductive, intuitive, almost inevitable: why not have a single, very powerful agent that does everything?

One agent to understand the request.
One agent to produce the architecture.
One agent to code.
One agent to secure.
One agent to test.
One agent to package.
One agent to document.
One agent to review.
One agent to deliver.

Or rather, to put it differently: why not ask a single agent to play all these roles at once?

On paper, the idea seems perfect. A single interface. A single loop. A single logic. A single point of contact. It’s simple, comfortable, seductive. It gives the impression of a fluid, almost magical system. You talk to it, it understands, it chains tasks together, it delivers.

The problem is that this vision mostly holds as long as you stay on short, well-bounded tasks, or on demonstrations.

As soon as you enter real web development, things get complicated very quickly.

And that’s precisely where you have to accept a less spectacular, but much more solid reality: a single agent is not enough.


The Super-Agent Fantasy Is Understandable

Let’s start by acknowledging one thing: this idea isn’t absurd. It’s attractive because it answers an old dream of simplification.

For years, software development has been stacking layers of complexity. More tools. More frameworks. More pipelines. More standards. More validations. More dependencies. More context. So naturally, the idea of a single agent capable of absorbing all this complexity and transforming it into a clean deliverable has something very seductive about it.

It’s also reassuring because it echoes a familiar logic: that of the versatile developer who can do almost everything. Understand the need, design, implement, fix, document, secure, deliver. The single agent then becomes a kind of augmented projection of this figure.

And it’s true that at first, it can work.

On a limited task, a well-guided agent can propose a coherent plan, generate an acceptable implementation, fix a few errors, improve the form, draft a base of documentation, and give the impression of astonishing continuity.

That’s often what creates the first wave of enthusiasm.

You tell yourself: “Alright, we’re there. We just need to give it enough context and enough instructions.”

But this impression of fluidity masks a structural fragility.


A Single Agent Ends Up Carrying Everything… So Mixing Everything

The real problem isn’t that a single agent is incapable of producing something. The real problem is that it has to carry too many layers of work at the same time.

Understanding a request is not the same thing as designing an architecture. Designing an architecture is not the same thing as producing a clean implementation. Producing an implementation is not the same thing as reasoning about security. Reasoning about security is not the same thing as thinking about packaging, review, compatibility, or maintenance.

In a human system, we’ve known this for a long time. We know that a good developer isn’t automatically a good reviewer. That an excellent architect isn’t necessarily the best person to validate a back-office UX. That a security specialist doesn’t look at the same mental object as a performance specialist. That a packager, a QA, or a release manager don’t pay attention to the same details.

Yet, when it comes to AI, many fall back into the idea that a single loop could absorb everything.

The problem is that this single loop quickly becomes a place of confusion. It has to keep track of the need, the existing codebase, business constraints, architecture, trade-offs, code quality, technical validations, tests, security, documentation, and the final output. It also has to decide when to explore, when to produce, when to verify, when to fix.

In other words, it has to think about multiple jobs at once.

And the more it tries to do everything in a single trajectory, the more it risks smoothing out the differences between these roles. It then gives the illusion of global coherence, while in reality it’s mixing levels of responsibility that would benefit from being separated.


Real Development Is Not a Continuous Demo

This is often where the difference between an impressive demo and reliable usage appears.

In a demo, you can very well ask an agent to produce a feature end to end. It understands the request, proposes something, modifies code, fixes a few errors, announces it’s done. The loop seems elegant. It works well because it stays short, framed, and observed.

But a real project is not an extended demo.

A real project contains gray areas, contradictions, poorly documented dependencies, historical conventions, old compromises, elements you absolutely must not break, decisions that only make sense in context, rules that are written nowhere but still structure the expected quality.

In this type of environment, the single agent starts showing its limits. Not necessarily by failing visibly. Often, the problem is more subtle. It advances, it produces, it seems coherent, but it drifts little by little. It fills gaps with assumptions. It rationalizes too quickly. It turns ambiguities into certainties. It confuses what’s practical with what’s right. It reduces real tensions to an average solution.

The danger isn’t always the obvious bug. The danger is sometimes the apparently clean, but misaligned solution.

And that’s precisely why the single agent becomes fragile as soon as the project gains in reality.


Specialization Is Not a Luxury, It’s a Reliability Mechanism

At this point, you need to flip the perspective. Many see specialization as a complication. In reality, it’s often a well-thought-out simplification.

Specializing doesn’t mean multiplying agents for the sake of it. It doesn’t mean building a Rube Goldberg machine where every micro-task is artificially separated. It doesn’t mean replacing one confusion with bureaucracy.

Specializing means recognizing that there are fundamental differences in the nature of different types of work.

An agent reasoning about architecture doesn’t need to approach the problem the same way as an agent reasoning about security. An agent preparing a package doesn’t need to behave like an agent inspecting functional coherence. An agent reviewing to identify risks shouldn’t be in exactly the same posture as the one that produced the code.

This separation brings something fundamental: it makes responsibilities clearer.

And when responsibilities are clearer, several benefits appear immediately. Context becomes more targeted. Expectations become more explicit. Outputs become more comparable. Gaps become more visible. Reasoning errors become easier to identify. Human review becomes more effective, because it doesn’t apply to a blurry mass, but to situated productions.

In short, specialization doesn’t primarily serve to produce more. It serves to produce more reliably.


A Good Agentic System Looks More Like a Team Than a Magic Wand

I believe this is the most useful image for thinking about the subject.

A good agentic system doesn’t look like a single oracle that would answer everything. It looks more like a work organization.

With roles.
With sequences.
With checkpoints.
With verifications.
With relay decisions.
With areas where you can parallelize.
With moments where you must, on the contrary, slow down and arbitrate.

This is quite logical, actually. Web development has never been simply a writing activity. It’s a coordination activity between sometimes contradictory requirements: speed, maintainability, security, readability, compatibility, cost, debt, robustness, delivery pressure. It would be strange for a mature AI system to ignore this reality and pretend to solve it in a single monolithic loop.

Thinking in specialized roles, on the contrary, allows you to embrace the real complexity of the job.

One agent can explore.
Another can structure.
Another can produce.
Another can verify.
Another can review from a specific angle.
Another can transform a validated decision into a clean deliverable.

The important point here isn’t to make lots of agents. The important point is to make a comprehensible work architecture emerge.

It’s not the number of agents that counts. It’s the quality of responsibilities and the transitions between them.


Handoffs Become a Real Engineering Topic

As soon as you accept that a single agent isn’t enough, another question appears immediately: how do you properly pass from one role to another?

That’s where an often underestimated topic comes into play: the handoff.

In a human team, handoffs are already critical. A bad transmission between product and engineering, between implementation and QA, between security and delivery, creates loss, misunderstandings, backtracking, reading errors. With agents, this problem doesn’t disappear. It simply changes form.

A good handoff must preserve the essentials without transmitting noise. It must specify what was done, what was decided, what remains open, what needs to be verified, what must not be questioned, and the level of confidence placed in the previous step.

In other words, a handoff is not just a text transfer. It’s a context structuring operation.

And that changes a lot. Because you then understand that a reliable agentic system depends not only on the intrinsic quality of each agent. It depends on the quality of the interfaces between them.

That’s often where maturity plays out. Not in the ability to do a lot, but in the ability to transmit just the right amount.


Why the Agent That Produces Shouldn’t Be the One That Validates

There’s a fairly simple rule in engineering: what produces shouldn’t be the sole judge of its own quality. Of course, self-control exists, and it’s useful. But it doesn’t replace validation from a different angle.

With agents, this rule remains true.

The agent that implements tends to implicitly defend the logic it just built. Even if it tries to review itself, it remains at least partially trapped in its own reasoning trajectory. It’s therefore vulnerable to the same blind spots as a human: post-hoc rationalization, excessive confidence in an earlier assumption, tendency to minimize a risk to preserve the coherence of the whole.

Separating production from validation precisely allows you to break this inertia.

A review agent doesn’t have the same objective as a build agent. A security agent doesn’t have the same reading grid as an architecture agent. An evaluation agent isn’t there to extend the solution, but to put it to the test.

This dissociation doesn’t make the system heavier by principle. It makes it more credible.

Because it reintroduces an essential principle: quality is not decreed within the same loop that produced the result.


The Problem Isn’t Having Multiple Agents, It’s Having Poorly Thought-Out Responsibilities

There’s sometimes a quick criticism of multi-agent approaches: “You’re overcomplicating everything. A good generalist agent with enough context would suffice.”

Sometimes, this criticism hits the mark. There are unnecessarily complex systems, with too many artificial roles, too many steps, too much ceremony. A bad multi-agent architecture can indeed become more fragile than a well-framed single agent.

But this observation doesn’t prove that the single agent is the right answer. It simply proves that responsibilities need to be designed seriously.

The topic is therefore not “one agent or several” in a purely numerical sense. The topic is: where should responsibility boundaries sit so that the system remains readable, reliable, and steerable?

Sometimes, a single loop is enough for a simple task. Sometimes, two or three well-defined roles already create a huge quality gain. Sometimes, a richer chain becomes necessary as soon as you touch security, compatibility, packaging, validation, or decision memory.

The right architecture isn’t the one that impresses the most. It’s the one that separates just what needs to be separated, and keeps together what can reasonably be kept together.


Mature Agentic AI Means Giving Up on Magic

I believe this is, at its core, the heart of the matter.

As long as you’re looking for magic, you naturally look for the single agent. A single point of contact, a single intelligence, a single flow. It’s elegant on the surface. It’s simple to tell. It’s seductive to sell.

But real web development rarely rewards magic. It rewards systems capable of absorbing complexity without denying it.

That’s why mature agentic AI looks less like a feat of strength and more like an architecture.

An architecture where each role has a reason to exist.
An architecture where handoffs are thought through.
An architecture where validations are separated from productions.
An architecture where context isn’t just accumulated, but intelligently distributed.
An architecture where the human retains real arbitration power.

It may not be the most spectacular vision. But it’s probably the most solid.


The Orchestrator Developer Emerges Right Here

It’s in this shift that the figure of the orchestrator developer becomes clearer.

If a single agent isn’t enough, then value is no longer in simply knowing how to “talk to AI.” It’s in the ability to organize a coherent work chain.

The orchestrator developer isn’t there to replace all roles by sheer will. They’re there to structure interactions between roles, choose the right levels of separation, inject the right context in the right place, demand the right checkpoints, and know when a production should be stopped, fixed, validated, or relaunched.

In other words, they become the guarantor of the system’s readability.

This isn’t a decorative role. It’s probably a growing part of technical value in the years to come.

Because a poorly structured agentic system can produce fast without producing right.
And because a well-thought-out agentic system can truly change a team’s quality level.

As I explained in I Stopped BMAD, the method matters as much as the tool. And orchestration is precisely that method.


Conclusion

The idea of the single agent is seductive, but it collapses as soon as web development returns to its true nature: a work of constraints, contexts, responsibilities, and verifications.

A single agent can be useful. It can even be impressive. But it isn’t enough as soon as you look for reliability, readability, and control on real projects.

That’s why specialization, handoffs, the separation between production and validation, and the structuring of roles become so important.

Agentic maturity doesn’t consist of dreaming of a super-agent capable of absorbing everything. It consists of building a work architecture where each role serves the overall quality.

And that’s precisely what opens the next logical step in this series.

Because if a single agent isn’t enough, then real performance depends on something else: the quality of context, skills, and method.

That’s what I’ll cover in the next article.


Coming Up in the Series

The Orchestrator Developer #1 — Why Agentic AI Truly Changes Web Development
The Orchestrator Developer #2 — Why a Single Agent Is Not Enough
The Orchestrator Developer #3 — Why Skills, Context, and Method Change Everything
The Orchestrator Developer #4 — The New Job: Frame, Orchestrate, Arbitrate


Nicolas Dabène — Architect of AI-native e-commerce transition & developer of the MCP Tools Plus module for PrestaShop. Convinced that agentic orchestration will be the competitive differentiator for tomorrow’s developers.