AI Summary Hub

Subagents

Hierarchical agents and delegation.

Definition

Subagents are agents that sit within a hierarchy: a parent agent delegates sub-tasks to child agents (subagents), which may in turn delegate to further subagents. This hierarchical structure keeps each agent focused on a narrow, well-defined responsibility rather than attempting to handle everything in a single loop.

They are one way to implement multi-agent systems with a clear chain of responsibility and ownership. The root agent owns the user-facing goal and is responsible for the final answer; subagents handle focused sub-tasks such as retrieval, code execution, validation, or formatting. The root agent coordinates timing, aggregates results, and decides when to retry or escalate.

Subagents are often used with spec-driven development or RDD, so that each subagent receives an explicit, testable specification for its outputs. The pattern scales naturally: as a workflow grows more complex, new subagents can be added for new responsibilities without restructuring the root agent's logic.
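As a minimal sketch of such a testable specification, a subagent's contract can be expressed as a small dataclass. All names here (SubagentSpec, validate_output, the retrieval schema) are invented for illustration, not part of any library:

```python
from dataclasses import dataclass

@dataclass
class SubagentSpec:
    """Explicit, testable contract for a subagent's inputs and outputs."""
    name: str
    input_schema: dict   # field name -> expected type
    output_schema: dict  # field name -> expected type

    def validate_output(self, output: dict) -> bool:
        """Check every required output field is present with the right type."""
        return all(
            key in output and isinstance(output[key], expected_type)
            for key, expected_type in self.output_schema.items()
        )

# Contract for a hypothetical retrieval subagent
retrieval_spec = SubagentSpec(
    name="retrieval",
    input_schema={"query": str},
    output_schema={"context": str, "sources": list},
)

print(retrieval_spec.validate_output(
    {"context": "RAG stands for ...", "sources": ["doc1"]}
))  # True
```

Because the contract is plain data, each subagent's output can be unit-tested against it before the root agent ever consumes the result.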

How it works

[Diagram: hierarchical delegation — the root agent fans out sub-tasks to subagents, each running its own internal loop]

The root agent receives the task, breaks it into sub-tasks, and assigns them to Subagent 1, Subagent 2, etc. (by role or capability). Each subagent runs its own loop (possibly with tools and an LLM) and returns results to the root. The root aggregates results (e.g. merges, selects, or passes to another subagent) and either continues the loop or returns to the user. Subagents can be specialized (e.g. retrieval, code, critique) and use the same or different models. Clear contracts (inputs/outputs or tool schemas) and error handling make the hierarchy debuggable and reusable.
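The decompose/delegate/aggregate loop above can be sketched without any LLM at all. The subagent names and stub bodies below are illustrative only:

```python
def summarize_subagent(chunk: str) -> str:
    """Child agent: condense one chunk (stubbed here as truncation)."""
    return chunk[:40].strip()

def critique_subagent(summary: str) -> str:
    """Child agent: validate the aggregated result."""
    return "PASS" if summary else "FAIL: empty summary"

def root_agent(document: str) -> dict:
    """Root agent: decompose, delegate, aggregate, validate."""
    # 1. Decompose the task into sub-tasks (here: fixed-size chunks)
    chunks = [document[i:i + 40] for i in range(0, len(document), 40)]
    # 2. Delegate each sub-task to a subagent
    partials = [summarize_subagent(c) for c in chunks]
    # 3. Aggregate the subagents' results
    merged = " ".join(partials)
    # 4. Validate; a real root would retry or escalate on FAIL
    return {"summary": merged, "verdict": critique_subagent(merged)}
```

The root never needs to know how a subagent does its job, only what contract it fulfils, which is what makes each level independently testable.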

When to use / When NOT to use

Scenario | Use subagents | Don't use subagents
---------|---------------|--------------------
Task decomposes into parallel independent sub-tasks | Yes — subagents can run concurrently | No — if sub-tasks are tightly coupled and sequential, the root can handle them inline
Reusing the same capability across workflows | Yes — the same subagent can be called from different roots | No — one-off tasks don't benefit from the abstraction
Sub-tasks require different tools or models | Yes — each subagent can have its own configuration | No — if one model with all tools is sufficient
Debugging and testing individual sub-task logic | Yes — subagents have clear inputs/outputs and are easy to unit-test | No — if the task is simple enough to test end-to-end
Rapid prototyping or simple workflows | No — hierarchy adds coordination overhead | Yes — a single agent loop is simpler and faster to iterate on
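For the first row — independent sub-tasks — subagents can genuinely run concurrently. A minimal sketch using Python's standard thread pool; the two counting subagents are invented for illustration:

```python
from concurrent.futures import ThreadPoolExecutor

def word_count_subagent(text: str) -> int:
    """Independent sub-task 1: count words."""
    return len(text.split())

def char_count_subagent(text: str) -> int:
    """Independent sub-task 2: count characters."""
    return len(text)

def root_agent(text: str) -> dict:
    """Root: dispatch independent sub-tasks concurrently, then aggregate."""
    with ThreadPoolExecutor() as pool:
        words = pool.submit(word_count_subagent, text)
        chars = pool.submit(char_count_subagent, text)
        return {"words": words.result(), "chars": chars.result()}

print(root_agent("subagents can run in parallel"))  # {'words': 5, 'chars': 29}
```

In a real system the sub-tasks would be LLM or tool calls, where concurrency hides most of the per-call latency.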

Comparisons

Approach | Structure | Delegation | Reusability | Debugging
---------|-----------|------------|-------------|----------
Single agent | Flat loop | None | Low | Trace one loop
Multi-agent (peer) | Flat / mesh | Peer-to-peer | Medium | Trace multiple loops
Subagent (hierarchical) | Tree | Root → children | High | Trace per level
Pipeline / chain | Sequential | Step-to-step | Medium | Step output inspection

Pros and cons

Pros | Cons
-----|-----
Clear separation of concerns — each subagent does one thing | Coordination overhead (latency, token cost, state passing)
Scalable — add new subagents for new responsibilities | Needs clear input/output contracts and error handling
Reusable — the same subagent plugs into different root workflows | Debugging across hierarchy levels can be complex
Root agent logic stays clean and high-level | Multiple LLM calls increase total cost

Code examples

A minimal root agent that orchestrates retrieval, generation, and validation subagents, using LangChain with an OpenAI model:

from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, SystemMessage

llm = ChatOpenAI(model="gpt-4o-mini")

# --- Define subagents ---

def retrieval_subagent(query: str) -> str:
    """Retrieve relevant context (simulated here)."""
    # In production: query a vector store
    return f"[Context for '{query}': RAG stands for Retrieval-Augmented Generation...]"

def generation_subagent(query: str, context: str) -> str:
    """Generate an answer given context."""
    response = llm.invoke([
        SystemMessage(content="Answer the question using only the provided context."),
        HumanMessage(content=f"Context:\n{context}\n\nQuestion: {query}"),
    ])
    return response.content

def validation_subagent(answer: str, context: str) -> str:
    """Check if the answer is grounded in the context."""
    response = llm.invoke([
        SystemMessage(content="Check if the answer is fully supported by the context. Reply PASS or FAIL with a reason."),
        HumanMessage(content=f"Context:\n{context}\n\nAnswer:\n{answer}"),
    ])
    return response.content

# --- Root agent orchestrates ---

def root_agent(user_query: str) -> str:
    context = retrieval_subagent(user_query)
    draft = generation_subagent(user_query, context)
    validation = validation_subagent(draft, context)

    if "FAIL" in validation.upper():
        # Retry once with a stricter instruction; a production system
        # would also re-validate the retried draft
        draft = generation_subagent(
            user_query + " (be precise and grounded)",
            context,
        )

    return draft

print(root_agent("What is RAG?"))
