Introduction
Twelve months ago, almost every enterprise deployment we reviewed was a single assistant with a sprawling system prompt and ambitions to do everything from procurement to legal review. Today, the rooms we sit in have grown quieter, and the diagrams on the whiteboards have grown smaller.
By the numbers
64p: median cost per answer, March 2026
8 → 80: agents per deployment, twelve-month span
4x: faster time-to-production for narrow scopes
One assistant, one job
The shift is not driven by fashion. It is driven by the three things that keep enterprise platform leads awake at night — evaluation, cost, and the awkward question of who is accountable when the model gets it wrong. Narrow agents help with all three at once, and the compromises are easier to articulate to a board.
A team running eight focused agents can tell you, honestly, which two of them are saving money and which six are merely interesting. The same team running one general-purpose assistant usually cannot.
| Scope | Agents | Median latency | Pass rate |
|---|---|---|---|
| Intake triage | 3 | 410 ms | 96.4% |
| Knowledge retrieval | 2 | 720 ms | 92.1% |
| Compliance classify | 1 | 290 ms | 98.7% |
| Escalation route | 2 | 180 ms | 99.2% |
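Per-scope numbers like these are what make the "which agents are earning their keep" conversation possible. As a small sketch, the table can be treated as typed records and checked against a pass-rate bar. The 95% threshold and the `ScopeStats` and `belowBar` names are hypothetical, chosen only for illustration.

```typescript
// Rows copied from the table above; the field names are illustrative assumptions.
interface ScopeStats {
  scope: string;
  agents: number;
  medianLatencyMs: number;
  passRate: number; // percent
}

const stats: ScopeStats[] = [
  { scope: "Intake triage", agents: 3, medianLatencyMs: 410, passRate: 96.4 },
  { scope: "Knowledge retrieval", agents: 2, medianLatencyMs: 720, passRate: 92.1 },
  { scope: "Compliance classify", agents: 1, medianLatencyMs: 290, passRate: 98.7 },
  { scope: "Escalation route", agents: 2, medianLatencyMs: 180, passRate: 99.2 },
];

// Return the scopes whose pass rate falls below the bar (the bar itself is a hypothetical choice).
function belowBar(rows: ScopeStats[], minPassRate: number): string[] {
  return rows.filter((r) => r.passRate < minPassRate).map((r) => r.scope);
}

const totalAgents = stats.reduce((sum, r) => sum + r.agents, 0);
console.log(totalAgents); // 8
console.log(belowBar(stats, 95)); // [ 'Knowledge retrieval' ]
```

The point is less the arithmetic than the shape: each scope carries its own measurable pass rate, so a question like "which scope is under the bar" has a one-line answer.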
What the orchestrator actually looks like
We hand out a tiny scheduler, not a framework. Four callables, a typed registry, and a policy that can say no. Most of the interesting questions then happen at the edges, which is where you want them.
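The route function that follows leans on a handful of types that are not spelled out here. One minimal way they could look is sketched below. The field names (`id`, `scope`, `payload`, `output`), the `"dispatch"` plan kind, the `InMemoryRegistry` class, and the `allowListPolicy` helper are all assumptions for illustration rather than the shipped definitions.

```typescript
// Hypothetical shapes for the names used by route(); every field here is an assumption.
interface AgentRequest {
  id: string;
  scope: string; // e.g. "intake-triage"
  payload: string;
}

type AgentResult =
  | { ok: true; output: string }
  | { ok: false; reason: string };

interface Agent {
  run(req: AgentRequest): Promise<AgentResult>;
}

// The policy either picks an agent or refuses, with a reason.
type Plan =
  | { kind: "dispatch"; agentId: string }
  | { kind: "refuse"; reason: string };

interface Policy {
  decide(req: AgentRequest): Promise<Plan>;
}

interface AgentRegistry {
  resolve(agentId: string): Agent;
}

// A typed registry backed by a Map; unknown ids fail loudly instead of returning undefined.
class InMemoryRegistry implements AgentRegistry {
  private readonly agents = new Map<string, Agent>();

  register(agentId: string, agent: Agent): void {
    this.agents.set(agentId, agent);
  }

  resolve(agentId: string): Agent {
    const agent = this.agents.get(agentId);
    if (agent === undefined) {
      throw new Error(`no agent registered for "${agentId}"`);
    }
    return agent;
  }
}

// A policy that can say no: requests outside the known scopes are refused.
function allowListPolicy(scopeToAgent: Map<string, string>): Policy {
  return {
    async decide(req) {
      const agentId = scopeToAgent.get(req.scope);
      return agentId === undefined
        ? { kind: "refuse", reason: `scope "${req.scope}" is not served` }
        : { kind: "dispatch", agentId };
    },
  };
}
```

Modelling the plan as a discriminated union means the compiler forces the caller to handle the refusal branch before it can read `agentId`, which is the check the route function performs first.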
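// Route one request: ask the policy for a plan, run the chosen agent, record the outcome.
// `policy` and `telemetry` are assumed to be module-scope dependencies.
export async function route(
  req: AgentRequest,
  registry: AgentRegistry,
): Promise<AgentResult> {
  const plan = await policy.decide(req);
  // The policy can say no; a refusal returns without touching any agent.
  if (plan.kind === "refuse") return { ok: false, reason: plan.reason };
  const agent = registry.resolve(plan.agentId);
  const result = await agent.run(req);
  // Record the request, the plan, and the result together so each answer stays attributable.
  await telemetry.record({ req, plan, result });
  return result;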
}
What we are watching next
The interesting question for 2026 is not whether specialised agents work — they do ¹ — but what the connective tissue between them looks like when the list grows from eight to eighty ². That is the problem we are currently sitting with, and the answer has already stopped looking like a product and started looking like an operating model ³.
What the data shows
[Chart: Cost per useful answer, twelve-month trailing]
[Chart: Agent time to production, by scope]
We used to have a model problem. Now we have an orchestration problem, which turns out to be a better problem to have.
Where we land
We will keep writing these as we find them. If any of this lands close to a problem you are working on, the team is always happy to talk it through.