Multi-Agent Orchestration: Supervisor, Pipeline, and Mesh Patterns
How to break a complex task across multiple agents. Supervisor vs. pipeline vs. mesh patterns, inter-agent communication, and failure budgets.
Multi-agent systems get hyped as the path to AGI and dismissed as unnecessary complexity. The truth: they're the right tool for a narrow set of problems, and the wrong tool for most of what people use them for. Here's how to tell the difference.
When Multi-Agent Actually Helps
- Parallelism: Tasks that can run concurrently (researching 5 topics at once) benefit enormously. A single agent loop works through its steps one at a time.
- Specialization: Domain-specific agents (legal, financial, code) that each have tailored tools, context, and prompts outperform a generalist agent on complex domains.
- Scale beyond context: Tasks that exceed a single context window (process 1000 documents) need multiple agents working across the corpus.
- Verification: A separate critic/verifier agent reviewing another agent's output tends to catch more errors than self-reflection.
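To make the parallelism case concrete, here is a minimal sketch of fanning out research subtasks with `asyncio`. The `research_topic` coroutine is a hypothetical stand-in for an LLM or tool call (it just sleeps briefly), so treat the names and the return shape as assumptions rather than any particular framework's API.

```python
import asyncio


async def research_topic(topic: str) -> str:
    """Hypothetical worker: stands in for an I/O-bound LLM or tool call."""
    await asyncio.sleep(0.01)  # simulated network latency
    return f"summary of {topic}"


async def research_all(topics: list[str]) -> dict[str, str]:
    # gather() runs the coroutines concurrently and returns results
    # in the same order as the inputs.
    summaries = await asyncio.gather(*(research_topic(t) for t in topics))
    return dict(zip(topics, summaries))


def run_research(topics: list[str]) -> dict[str, str]:
    """Synchronous entry point for scripts and tests."""
    return asyncio.run(research_all(topics))
```

With real model calls, total latency is roughly that of the slowest topic rather than the sum of all of them, which is where the benefit comes from.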
The Three Patterns
| Pattern | Structure | Best For | Complexity |
|---|---|---|---|
| Orchestrator-Worker | One planner dispatches tasks to specialized workers | Parallelizable tasks, domain specialization | Medium |
| Peer-to-Peer | Agents communicate directly, no central coordinator | Collaborative writing, debate/critique | High |
| Hierarchical | Multiple levels: planner → sub-planners → workers | Very complex tasks, enterprise workflows | Very High |
Orchestrator-Worker: The Default Choice
Start here. The orchestrator receives the task, breaks it into subtasks, dispatches them to workers, collects the results, and synthesizes a final answer. Workers are stateless: they receive a task, execute it, and return a result. This is easy to debug, easy to scale, and easy to reason about. The orchestrator's prompt is the hardest part to write: it needs to decompose tasks well and know when to retry vs. give up.
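The loop described above can be sketched in a few lines. This is an illustration under stated assumptions, not a reference implementation: `decompose`, `worker`, and `synthesize` are plain callables standing in for LLM-backed components, and `WorkerResult`, `split_on_commas`, `flaky_worker`, and `join_successes` are hypothetical names invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class WorkerResult:
    """Every worker call ends as ok or as a named error, never silently."""
    subtask: str
    ok: bool
    output: str = ""
    error: str = ""


def run_worker(worker: Callable[[str], str], subtask: str,
               max_attempts: int = 2) -> WorkerResult:
    """Call a stateless worker, retrying up to max_attempts before giving up."""
    last_error = ""
    for _ in range(max_attempts):
        try:
            return WorkerResult(subtask, ok=True, output=worker(subtask))
        except Exception as exc:  # sketch only: real code should catch narrower errors
            last_error = f"{type(exc).__name__}: {exc}"
    return WorkerResult(subtask, ok=False, error=last_error)


def orchestrate(task: str,
                decompose: Callable[[str], List[str]],
                worker: Callable[[str], str],
                synthesize: Callable[[List[WorkerResult]], str]) -> str:
    """Decompose the task, dispatch subtasks, collect results, synthesize."""
    subtasks = decompose(task)
    results = [run_worker(worker, s) for s in subtasks]
    return synthesize(results)


# Hypothetical stand-ins for LLM-backed components.
def split_on_commas(task: str) -> List[str]:
    return [part.strip() for part in task.split(",")]


def flaky_worker(subtask: str) -> str:
    if subtask == "bad":
        raise ValueError("cannot handle subtask")
    return subtask.upper()


def join_successes(results: List[WorkerResult]) -> str:
    done = [r.output for r in results if r.ok]
    failed = [r.subtask for r in results if not r.ok]
    return f"done={done} failed={failed}"
```

Calling `orchestrate("a, bad, c", split_on_commas, flaky_worker, join_successes)` returns a summary that names the failed subtask instead of hiding it. The design choice worth copying is that the synthesizer always sees failures explicitly.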
Communication: A2A vs Shared Memory
Agents coordinate in one of two ways: direct agent-to-agent (A2A) messaging, for example Google's A2A protocol or custom message passing, or shared memory, where all agents read and write a common state store. Shared memory is simpler but invites race conditions when several agents write at once. Direct messaging is cleaner but requires explicit message schemas. For most teams, shared memory with optimistic locking is the pragmatic choice. A2A shines when agents run across different services or organizations.
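As a rough illustration of shared memory with optimistic locking, the sketch below keeps a version number next to each value and rejects any write based on a stale read. `SharedState`, `VersionConflict`, and `update_with_retry` are hypothetical names for this example; a production system would more likely lean on a database or key-value store that offers conditional writes.

```python
import threading


class VersionConflict(Exception):
    """Raised when a write is based on a stale read."""


class SharedState:
    """In-memory store where each key carries a version counter."""

    def __init__(self) -> None:
        self._lock = threading.Lock()  # guards only the compare-and-set step
        self._data = {}  # key -> (version, value)

    def read(self, key):
        """Return (version, value); unknown keys read as (0, None)."""
        with self._lock:
            return self._data.get(key, (0, None))

    def write(self, key, value, expected_version: int) -> int:
        """Write only if nobody else has written since expected_version."""
        with self._lock:
            current_version, _ = self._data.get(key, (0, None))
            if current_version != expected_version:
                raise VersionConflict(
                    f"{key!r}: expected v{expected_version}, found v{current_version}"
                )
            self._data[key] = (current_version + 1, value)
            return current_version + 1


def update_with_retry(store: SharedState, key, update_fn, max_attempts: int = 5) -> int:
    """Read, transform, and write back; re-read and retry on conflict."""
    for _ in range(max_attempts):
        version, value = store.read(key)
        try:
            return store.write(key, update_fn(value), version)
        except VersionConflict:
            continue  # another agent wrote first, so start over from a fresh read
    raise VersionConflict(f"gave up on {key!r} after {max_attempts} attempts")
```

Agents never hold a lock while they think. They only pay for a retry when two of them actually collide, which suits workloads where conflicts are rare.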
What Actually Goes Wrong
- Error propagation: one worker fails, orchestrator doesn't handle it, whole pipeline returns garbage. Build explicit error contracts between agents.
- Cost explosion: 10 parallel agents × 10 steps each = 100 LLM calls. Budget and monitor before running at scale.
- Coordination overhead: orchestrators that decompose too granularly spend more tokens on coordination than the workers spend on work.
- Debugging nightmare: tracing a failure across 5 agents without good observability is hours of work. Instrument before you build.
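The cost arithmetic above is easy to enforce in code. Here is a minimal sketch of a call budget that fails loudly instead of letting a run overspend. `CallBudget`, `BudgetExceeded`, and `estimate_calls` are hypothetical names for this example, and counting calls (rather than tokens or dollars) is a simplifying assumption.

```python
class BudgetExceeded(RuntimeError):
    """Raised when a run tries to spend more LLM calls than it was given."""


def estimate_calls(agents: int, steps_per_agent: int) -> int:
    """Worst-case call count if every agent uses all of its steps."""
    return agents * steps_per_agent


class CallBudget:
    """Counts LLM calls across all agents in one run."""

    def __init__(self, max_calls: int) -> None:
        self.max_calls = max_calls
        self.used = 0

    def charge(self, calls: int = 1) -> int:
        """Record calls and return what is left; refuse to go over budget."""
        if self.used + calls > self.max_calls:
            raise BudgetExceeded(
                f"budget of {self.max_calls} calls exhausted ({self.used} used)"
            )
        self.used += calls
        return self.max_calls - self.used
```

A run with 10 agents at 10 steps each would get `CallBudget(estimate_calls(10, 10))`, with each agent calling `charge()` before every model call. Hitting the limit then surfaces as an explicit error the orchestrator can handle.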
Honest take: most tasks that teams try to solve with multi-agent systems can be solved with a single well-structured agent and good tooling. Multi-agent adds value only when you genuinely need parallelism, specialization, independent verification, or scale beyond one context window.