Multi-Agent Orchestration — The Essence of Routing Design
When running multiple AI agents like Claude and Codex, task routing is the hardest challenge — and it mirrors delegation in engineering management.
Overview
The era of running multiple AI agents simultaneously has arrived. Claude excels at judgment and context understanding, while Codex shines at precise code generation. But the hardest problem is routing — deciding which agent should handle which task.
In this article, I examine multi-agent routing from an Engineering Manager’s (EM) perspective, arguing that it shares the same structure as delegation in people management.
Why Routing Is the Hardest Challenge
The Limits of a Single Agent
Assigning everything to one agent leads to context window overflow, lack of specialization, and response delays. This naturally drives us to split agents by domain expertise.
The Real Problem After Splitting
Splitting agents isn’t the hard part. The real challenges are:
- Task classification ambiguity: “Is this PR review about code quality or architecture judgment?”
- Context transfer costs: Information loss when passing context between agents
- Failure re-routing: Fallback strategies when an agent fails
- Dependency management: Pipeline design where A’s output feeds B’s input
```mermaid
graph TD
    Task[New Task] --> Router{Routing Engine}
    Router -->|Judgment/Context| Claude[Claude Agent]
    Router -->|Code Generation| Codex[Codex Agent]
    Router -->|Information Search| Search[Search Agent]
    Router -->|Unclear| Fallback[Fallback: Delegate to Claude]
    Claude --> Merge[Merge Results]
    Codex --> Merge
    Search --> Merge
    Fallback --> Router
```
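The flow above can be sketched as a small keyword-based classifier with a fallback path. The `route` function, the keyword lists, and the agent names are illustrative assumptions for this sketch, not a real framework's API:

```python
# Minimal sketch of the routing flow above. The keyword rules and agent
# names are illustrative assumptions, not a production classifier.
KEYWORD_ROUTES = {
    "claude": ["review", "judge", "design", "tradeoff"],
    "codex": ["implement", "refactor", "test", "generate"],
    "search": ["find", "lookup", "research"],
}

def route(task_description: str) -> str:
    """Return an agent name for a task, falling back to Claude when unclear."""
    text = task_description.lower()
    for agent, keywords in KEYWORD_ROUTES.items():
        if any(k in text for k in keywords):
            return agent
    return "claude"  # Fallback: delegate unclear tasks to Claude
```

In practice the classifier would be an LLM call rather than keyword matching, but the shape is the same: classify first, and route anything unclassifiable to the fallback agent.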
The Same Structure as EM Delegation
A Manager’s Daily Routine
Think about what an Engineering Manager does every day:
| EM’s Decision | Agent Routing |
|---|---|
| "Feature implementation → Engineer A" | "Code generation → Codex" |
| "Architecture review → Engineer B" | "Design judgment → Claude" |
| "Simple bug fix → Junior dev" | "Simple task → Lightweight model" |
| "Ambiguous? I’ll handle it" | "Unclear? Orchestrator handles it” |
Three Levels of Delegation
Applying a delegation framework from EM experience to agents:
```mermaid
graph TB
    subgraph "Level 1: Full Delegation"
        L1[Clear I/O<br/>Unit test writing, format conversion]
    end
    subgraph "Level 2: Guided Delegation"
        L2[Direction + autonomous execution<br/>PR review, refactoring]
    end
    subgraph "Level 3: Collaborative Execution"
        L3[Orchestrator intervenes<br/>Architecture decisions, tradeoff judgment]
    end
    L1 --> L2 --> L3
```
Level 1 — Full Delegation: Tasks with clear inputs and outputs. Unit test generation, JSON format conversion. Just throw these at Codex.
Level 2 — Guided Delegation: Set the direction but let the agent handle the specifics. PR reviews, code refactoring. Claude writes the guidelines, Codex executes.
Level 3 — Collaborative Execution: Tasks where the orchestrator itself must be deeply involved. Architecture decisions, technology choices.
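The three levels can be encoded directly in the routing layer. The enum values and the task-type mapping below are illustrative assumptions drawn from the examples above:

```python
from enum import Enum

# Hypothetical encoding of the three delegation levels described above.
class DelegationLevel(Enum):
    FULL = 1           # Clear I/O: hand off entirely (e.g. unit tests)
    GUIDED = 2         # Orchestrator sets direction, agent executes
    COLLABORATIVE = 3  # Orchestrator stays in the loop (e.g. architecture)

# Example task-type mapping; the categories are illustrative assumptions.
LEVEL_BY_TASK = {
    "unit_test": DelegationLevel.FULL,
    "format_conversion": DelegationLevel.FULL,
    "pr_review": DelegationLevel.GUIDED,
    "refactoring": DelegationLevel.GUIDED,
    "architecture_decision": DelegationLevel.COLLABORATIVE,
}

def delegation_level(task_type: str) -> DelegationLevel:
    # Unknown task types default to COLLABORATIVE: when in doubt, stay involved.
    return LEVEL_BY_TASK.get(task_type, DelegationLevel.COLLABORATIVE)
```

The default matters: an unrecognized task should pull the orchestrator in, just as an EM handles the ambiguous work themselves rather than guessing at an assignee.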
Routing Design Lessons from Real Cases
okash1n’s Claude Code + Codex MCP Setup
okash1n (super_bonochin) shared a setup connecting Codex to Claude Code via MCP. The key insight:
- Claude Code serves as the orchestrator, managing the overall flow
- Codex acts as an MCP server, specializing in code generation
- When Claude determines “this is a code generation task,” it delegates to Codex
This is exactly the structure of an EM (Claude) delegating implementation to a senior engineer (Codex).
NabbilKhan’s 8-Agent Operation
NabbilKhan demonstrated running 8 agents simultaneously. The biggest problem? Routing:
- The judgment cost of deciding “which of 8 agents handles this task”
- Splitting strategy when a task spans multiple agents’ domains
- The difficulty of synchronizing context across agents
This is exactly what an EM managing 8 engineers faces.
Core Principles of Routing Design
1. Clear Role Definition (Role Boundary)
Document each agent’s scope of responsibility clearly — like writing a Job Description.
```yaml
# agents/codex.yaml
name: Codex Agent
role: Code generation specialist
capabilities:
  - Function/class implementation
  - Unit test writing
  - Refactoring execution
boundaries:
  - No architecture decisions
  - No external API design
escalation: Escalate to Claude Agent
```
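A role definition like this can be mirrored in code so the router can check capabilities before dispatching. The `AgentRole` class below is a sketch; its field names follow the YAML keys above, but the class itself is an illustrative assumption:

```python
from dataclasses import dataclass, field

# In-code mirror of the agents/codex.yaml definition above. The class
# is an illustrative sketch, not part of any real agent framework.
@dataclass
class AgentRole:
    name: str
    role: str
    capabilities: list = field(default_factory=list)
    boundaries: list = field(default_factory=list)
    escalation: str = ""

    def can_handle(self, capability: str) -> bool:
        return capability in self.capabilities

codex = AgentRole(
    name="Codex Agent",
    role="Code generation specialist",
    capabilities=[
        "Function/class implementation",
        "Unit test writing",
        "Refactoring execution",
    ],
    boundaries=["No architecture decisions", "No external API design"],
    escalation="Claude Agent",
)
```

With `can_handle`, the router rejects out-of-scope requests up front and knows where to escalate them, instead of discovering the boundary mid-task.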
2. Explicit Routing Criteria
```mermaid
graph LR
    Input[Task Input] --> Classify{Classify}
    Classify -->|Code Change| CodeCheck{Change Scope}
    CodeCheck -->|Single File| Codex[Codex]
    CodeCheck -->|Multiple Files| Claude[Claude Review → Codex]
    Classify -->|Judgment Needed| Claude2[Claude]
    Classify -->|Info Gathering| Search[Search Agent]
```
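The decision tree above translates almost one-to-one into code. This sketch returns the agent pipeline for a task; the function name, task kinds, and the final fallback branch (not shown in the diagram) are illustrative assumptions:

```python
# Decision-tree version of the routing criteria above. Task kinds and
# return values are illustrative assumptions for this sketch.
def route_pipeline(kind: str, files_changed: int = 0) -> list:
    """Return the ordered agent pipeline for a task."""
    if kind == "code_change":
        if files_changed <= 1:
            return ["codex"]
        # Multi-file changes: Claude reviews scope first, then Codex executes
        return ["claude", "codex"]
    if kind == "judgment":
        return ["claude"]
    if kind == "info_gathering":
        return ["search"]
    return ["claude"]  # Assumed fallback: unclassified work goes to Claude
```

Returning a pipeline rather than a single agent keeps the multi-file case (review, then execute) expressible in the same structure as the simple cases.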
3. Escalation Paths for Failures
Just as a stuck team member escalates to their manager, a failing agent escalates to the orchestrator.
```python
async def route_task(task: Task) -> Result:
    agent = classify(task)
    result = await agent.execute(task)
    if result.confidence < 0.7:
        # Escalation: orchestrator handles directly
        return await orchestrator.handle(task, context=result)
    return result
```
4. Feedback Loops for Routing Improvement
Just as a manager adjusts future delegation based on results, agent routing should improve based on outcomes:
- Track per-agent success/failure rates
- Identify patterns that frequently trigger re-routing
- Progressively refine routing rules
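The tracking side of that loop can start as something very small. The class below is a minimal sketch; the 0.7 threshold and the re-route rule are illustrative assumptions (the threshold deliberately echoes the confidence cutoff in the escalation example):

```python
from collections import defaultdict

# Minimal per-agent success-rate tracker for routing feedback.
# The threshold and re-route rule are illustrative assumptions.
class RoutingStats:
    def __init__(self):
        self.outcomes = defaultdict(lambda: {"ok": 0, "fail": 0})

    def record(self, agent: str, success: bool) -> None:
        self.outcomes[agent]["ok" if success else "fail"] += 1

    def success_rate(self, agent: str) -> float:
        o = self.outcomes[agent]
        total = o["ok"] + o["fail"]
        # Agents with no history are trusted until proven otherwise
        return o["ok"] / total if total else 1.0

    def should_reroute(self, agent: str, threshold: float = 0.7) -> bool:
        # Flag agents whose success rate has dropped below the threshold
        return self.success_rate(agent) < threshold
```

Even this crude counter surfaces the patterns the bullets describe: which agents fail often, and which routing rules keep sending them the wrong work.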
Conclusion
The essence of multi-agent orchestration isn’t technology — it’s design philosophy. Just as an EM distributes work to team members, we distribute tasks to agents. The core principles:
- Define role boundaries clearly — Agent responsibilities like Job Descriptions
- Distinguish delegation levels — Full / Guided / Collaborative
- Design escalation paths — Prepare fallbacks for failure scenarios
- Continuously improve via feedback — Track routing outcomes and refine rules
Ultimately, just as a great manager builds a great team, a great orchestrator builds a great agent system.