
Asynchronous subagents, now live in Hermes Agent from Nous Research, remove one of the biggest sources of friction in multi-agent AI systems. A parent agent can delegate long-running tasks to child agents that run entirely in the background, and control returns to the chat immediately so you can keep working while the work gets done.
If you have used Hermes Agent before and been forced to stare at a frozen chat window while a subagent crawled through a complex task, this update changes everything about how delegated work fits into your workflow.
What Are Asynchronous Subagents?
Definition: Asynchronous subagents are isolated child agents that a parent agent spawns to complete delegated tasks, but that run independently in the background without blocking the parent’s conversation thread.
In traditional (synchronous) agent architectures, the parent agent is essentially paused the moment it delegates a task. It sits idle inside the tool call, waiting for the child to finish, before the conversation can resume. This is the “blocking” problem — and it makes long-running delegation unusable in practice.
Asynchronous subagents invert that contract. The parent agent delegates via a tool call, receives a task_id immediately, and is free to continue chatting, issuing new commands, or spawning additional tasks. The child agent runs independently. The parent checks in, steers, or collects the result whenever it makes sense — not because it is forced to wait.
This shifts multi-agent orchestration from a sequential, wait-and-see model to a genuinely concurrent one.
How Hermes Agent Implemented Asynchronous Subagents
Nous Research’s Hermes Agent is an open-source personal agent built around a parent-child delegation model. A parent agent can spawn child agents — called subagents — to fan out work across parallel tasks. Until June 2026, that delegation used the delegate_task tool synchronously: the parent blocked inside the tool call until every child completed, freezing the chat for the duration.
The update, announced by Nous Research and co-founder Teknium on June 15, 2026, ships an entirely new async_delegation toolset (tracked in GitHub Issue #5586). It is available now to all existing users via hermes update.
The architectural change is straightforward in principle: instead of the parent waiting inside the tool call, background agents are launched as in-process threads. They reuse the same AIAgent machinery, credentials, and toolsets as the synchronous delegate_task — but they run without holding the parent’s conversation hostage.
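The difference between waiting inside the tool call and launching a background thread can be sketched in plain Python. This is a minimal illustration, not Hermes Agent's code: run_child, delegate_sync, delegate_async, and collect are hypothetical names, and the sleep stands in for a real child agent run.

```python
import threading
import time
import uuid

_results = {}   # task_id -> final summary
_threads = {}   # task_id -> worker thread


def run_child(goal):
    """Hypothetical stand-in for a child agent run."""
    time.sleep(0.2)  # simulate a long-running task
    return f"summary for: {goal}"


def delegate_sync(goal):
    """Blocking delegation: the caller waits until the child finishes."""
    return run_child(goal)


def delegate_async(goal):
    """Non-blocking delegation: start a thread, return a task id at once."""
    task_id = uuid.uuid4().hex[:8]

    def worker():
        _results[task_id] = run_child(goal)

    thread = threading.Thread(target=worker, daemon=True)
    _threads[task_id] = thread
    thread.start()
    return task_id


def collect(task_id):
    """Block until the background task finishes, then return its result."""
    _threads[task_id].join()
    return _results[task_id]


task_id = delegate_async("survey sources")
print("chat is free; task id:", task_id)  # printed before the child is done
print(collect(task_id))
```

The only structural change between the two delegation functions is where the wait happens: delegate_sync waits inside the call, while delegate_async defers the wait to whenever collect is called.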
The async_delegation Toolset — Six Tools Explained
The new toolset covers the complete lifecycle of an asynchronous subagent from spawn to collection:
- delegate_task_async: Spawns a background subagent and returns a task_id immediately, freeing the parent to continue.
- check_task: A non-blocking status poll that returns the current state and recent output of a running task.
- steer_task: Injects a new message into a running subagent, letting you course-correct mid-execution without stopping the task.
- collect_task: Blocks until a specific task completes, then returns the full result. Use this when you are ready to merge the output.
- cancel_task: Stops a running background task immediately.
- list_tasks: Returns all async tasks active in the current session.
Together, these six tools give you granular control over asynchronous subagents at every stage: spawn, monitor, redirect, collect, and cancel. Before this update, Hermes offered no comparable lifecycle control for delegated tasks.
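To make the lifecycle concrete, here is a toy in-memory registry that mimics the six operations with threads, an event for cancellation, and a queue for steering messages. It is a sketch under stated assumptions, not the Hermes implementation: the Task and TaskRegistry classes, the return shapes, and the simulated work loop are all hypothetical. Only the six tool names come from the toolset described above.

```python
import queue
import threading
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Task:
    goal: str
    status: str = "running"  # running | done | cancelled
    log: list = field(default_factory=list)
    inbox: queue.Queue = field(default_factory=queue.Queue)
    stop: threading.Event = field(default_factory=threading.Event)
    thread: Optional[threading.Thread] = None


class TaskRegistry:
    """Toy registry; names and return shapes are illustrative assumptions."""

    def __init__(self):
        self._tasks = {}

    def delegate_task_async(self, goal, steps=3):
        task_id = uuid.uuid4().hex[:8]
        task = Task(goal=goal)
        task.thread = threading.Thread(
            target=self._run, args=(task, steps), daemon=True
        )
        self._tasks[task_id] = task
        task.thread.start()
        return task_id  # returned before the work is done

    def _run(self, task, steps):
        for i in range(steps):
            # Simulated work; wakes early if cancel_task sets the event.
            if task.stop.wait(timeout=0.05):
                task.status = "cancelled"
                return
            while not task.inbox.empty():  # apply any steering messages
                task.log.append("steered: " + task.inbox.get_nowait())
            task.log.append(f"step {i + 1}/{steps}")
        task.status = "done"

    def check_task(self, task_id):
        task = self._tasks[task_id]
        return {"status": task.status, "recent": task.log[-3:]}

    def steer_task(self, task_id, message):
        self._tasks[task_id].inbox.put(message)

    def collect_task(self, task_id):
        task = self._tasks[task_id]
        task.thread.join()  # blocks until the worker exits
        return {"status": task.status, "summary": f"{task.goal}: {len(task.log)} log lines"}

    def cancel_task(self, task_id):
        self._tasks[task_id].stop.set()

    def list_tasks(self):
        return {tid: task.status for tid, task in self._tasks.items()}


registry = TaskRegistry()
task_id = registry.delegate_task_async("review module A")
registry.steer_task(task_id, "focus on error handling")
print(registry.check_task(task_id))
print(registry.collect_task(task_id))
```

In this sketch, steering and cancellation are cooperative: the worker only notices them between steps.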
Synchronous vs Asynchronous Delegation — Side-by-Side Comparison
Understanding the difference between the old and new delegation models is critical for designing effective agentic workflows. Here is a direct comparison across every dimension that matters:
| Dimension | Synchronous delegate_task | Asynchronous async_delegation |
|---|---|---|
| Parent chat behavior | Blocks until all children finish | Returns a task_id immediately; chat stays free |
| Mid-run control | None — you wait | Check status, steer, collect, or cancel per task |
| Execution model | Parent waits inside the tool call | Background in-process threads |
| Context cost | Only the final summary returns to parent | Only the final summary returns to parent |
| Child isolation | Fresh conversation per child | Fresh conversation per child |
| Best for | Short fan-out tasks you can afford to wait on | Long tasks you want to run alongside the chat |
| Cross-session durability | Not durable | Single-session (ACP targets cross-turn durability) |
| How to trigger | delegate_task | delegate_task_async + lifecycle tools |
The context cost behavior is identical between both models: only the subagent’s final summary returns to the parent’s context window. The child’s intermediate tool calls, reasoning steps, and scratchpad work are discarded. This is by design — it keeps the parent’s context lean even when children do extensive work.
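The summary-only return can be illustrated with a few lines of Python. This is a simplified model rather than Hermes code: run_child and the list-based parent_context are hypothetical, and the 200 simulated tool calls are an arbitrary figure chosen for the example.

```python
def run_child(goal, context):
    """Hypothetical child run: works in a private scratchpad, returns a summary."""
    scratchpad = [f"goal: {goal}", f"context: {context}"]
    for i in range(200):
        scratchpad.append(f"tool call {i}")  # intermediate work stays here
    return f"{goal}: finished after {len(scratchpad) - 2} tool calls"


parent_context = ["user: compare vendors A, B, and C"]
for vendor in ("A", "B", "C"):
    summary = run_child(goal=f"research vendor {vendor}", context="pricing only")
    parent_context.append(f"subagent summary: {summary}")

# One user turn plus three summaries, regardless of how much each child did.
print(len(parent_context))
```

The parent's context grows by one entry per child, not by one entry per child tool call.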
Why Asynchronous Subagents Matter for Agentic AI Workflows
The synchronous blocking problem was not a minor inconvenience — it was a fundamental architectural constraint that prevented whole categories of multi-agent use cases from being practical. Here is what asynchronous subagents change concretely:
- Concurrent delegation no longer blocks the chat. You can spawn multiple asynchronous subagents and let them work simultaneously while you keep talking to the parent, instead of waiting for the whole batch to finish.
- Long-running tasks become first-class citizens. Tasks that take minutes or hours — deep research, large-scale file processing, extended web crawls — can now run in the background without holding your conversation hostage.
- Mid-flight steering is possible. The steer_task tool means you can monitor a subagent’s progress and redirect it if it is heading in the wrong direction, without cancelling and restarting.
- Cost optimization through model routing. Subagents can be routed to cheaper models via config.yaml. Async execution makes this more practical because you are no longer waiting on-screen for every cheaper-model call to complete.
- Workflow composability improves. Complex orchestrations, where one group of subtasks must complete before a second group begins, become expressible as collect-then-spawn patterns rather than rigid sequential chains.
- User experience is fundamentally better. Waiting for a frozen chat is cognitively disruptive. Async delegation means the parent agent can keep you informed, updated, and productive while background work proceeds.
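The collect-then-spawn pattern mentioned in the list above can be sketched with Python's standard concurrent.futures module as a stand-in for the async tools. The child function and the goal strings are hypothetical. In Hermes, the spawn and collect steps would be tool calls made by the parent agent rather than direct function calls.

```python
from concurrent.futures import ThreadPoolExecutor


def child(goal):
    """Hypothetical stand-in for a subagent that returns a final summary."""
    return f"summary({goal})"


def collect_then_spawn(sources):
    with ThreadPoolExecutor(max_workers=4) as pool:
        # Group 1: spawn one task per source; submit() returns immediately.
        group_one = [pool.submit(child, f"summarize {s}") for s in sources]
        # Collect: result() blocks until that task has finished.
        summaries = [future.result() for future in group_one]
        # Group 2: starts only after every group-one result is in hand.
        synthesis = pool.submit(child, "synthesize " + " + ".join(summaries))
        return synthesis.result()


print(collect_then_spawn(["paper-1", "paper-2"]))
```

The dependency between the two groups is expressed by where the collect step sits, not by a fixed chain of blocking calls.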
Real-World Use Cases Unlocked by Asynchronous Subagents
Long-Running Research Tasks
Before asynchronous subagents, you could not realistically ask Hermes Agent to delegate a deep research task — say, surveying 50 sources on a topic, summarizing each, and producing a comparative synthesis — without accepting that the chat would be frozen for the entire duration.
With async delegation, you spawn that research subagent with delegate_task_async, continue chatting or issuing other instructions, and call collect_task when you are ready to review the final synthesis. You can even use check_task periodically to see how far along it is and steer_task to narrow or redirect the research direction mid-run.
Parallel Code Review and Testing
A common agentic coding pattern is to split a codebase into modules and review each in parallel. In the synchronous model, the parent blocked until every module’s review had finished, so the chat stayed frozen for as long as the slowest review took.
With asynchronous subagents, you spawn a subagent per module, each running in its own isolated background thread, and keep working while they run. You collect results as each review finishes or once all are done. The list_tasks tool lets you track the status of every running review at a glance.
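A per-module fan-out of this kind can be simulated with threads to show how results arrive as each review finishes. The module names, durations, and the review_module function are made up for illustration; the sleep stands in for a real subagent review.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def review_module(name, seconds):
    """Hypothetical reviewer: sleeps to simulate work, then reports."""
    time.sleep(seconds)
    return f"{name}: no blocking issues"


modules = {"auth": 0.3, "billing": 0.2, "search": 0.1}

start = time.perf_counter()
reviews = {}
with ThreadPoolExecutor(max_workers=len(modules)) as pool:
    futures = {pool.submit(review_module, n, s): n for n, s in modules.items()}
    for future in as_completed(futures):  # yields each review as it finishes
        reviews[futures[future]] = future.result()
elapsed = time.perf_counter() - start

print(f"wall time {elapsed:.2f}s vs {sum(modules.values()):.2f}s one after another")
```

With real subagents the achievable overlap also depends on provider rate limits, so the timing here is only indicative.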
Multi-Model Cost Optimization
Hermes Agent allows subagent model routing through config.yaml. Expensive frontier models are often overkill for sub-tasks like reformatting, summarizing structured data, or extracting entities from text. Routing those subtasks to cheaper models is sensible — but in the synchronous model, the savings were offset by the friction of watching slower model calls complete on-screen.
Async delegation removes that friction entirely. You spawn cost-optimized subagents in the background, route them to the appropriate model tier, and collect results when they are ready. The parent agent running on a more capable model stays responsive and can tackle higher-complexity tasks in the meantime.
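The routing idea itself is simple enough to show in Python. Note that this is not the config.yaml schema, which the article does not show: the task kinds, the model tier names, and the pick_model function are all hypothetical placeholders.

```python
# Hypothetical task kinds and tier names; real routing lives in config.yaml,
# whose keys are not shown here.
ROUTES = {
    "reformat": "cheap-model",
    "summarize": "cheap-model",
    "extract_entities": "cheap-model",
}
DEFAULT_MODEL = "frontier-model"


def pick_model(task_kind):
    """Return the model tier for a subtask, defaulting to the capable tier."""
    return ROUTES.get(task_kind, DEFAULT_MODEL)


for kind in ("summarize", "architecture_review"):
    print(kind, "->", pick_model(kind))
```

Defaulting unknown task kinds to the more capable tier is a conservative choice for the sketch: an unclassified task costs more but is less likely to be under-served.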
How Context Isolation Keeps Parent Agents Efficient
One of Hermes Agent’s most important design decisions — both for synchronous and asynchronous subagents — is strict context isolation between parent and child.
Each subagent starts with a completely fresh conversation. It has no knowledge of the parent’s history, previous tool calls, or chain of reasoning. The parent must pass all relevant context explicitly through the goal and context fields when spawning the child. Only the child’s final summary is returned to the parent.
This isolation serves two purposes. First, it keeps the parent’s context window from growing linearly with the work its children do. A parent agent that spawns ten research subagents, each doing hundreds of tool calls, would have an unmanageable context if those calls flowed back upstream. The summary-only return pattern prevents this.
Second, it creates clean separation of concerns. Each child is a purpose-built, single-task agent operating with the minimal context it needs. This makes subagents more predictable and easier to debug, since their behavior is not influenced by accumulated parent-level reasoning.
The tradeoff is that the parent must be explicit about what the child needs. Vague goal fields or missing context will produce vague subagent results. Designing effective async workflows in Hermes Agent requires intentional context passing — treating the child spawn as a formal handoff document, not a casual delegation.
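One way to treat the spawn as a formal handoff is to check the goal and context before delegating. The sketch below assumes nothing about Hermes beyond the goal and context fields named above: the Handoff class, the check_handoff function, and the five-word threshold are hypothetical, and the threshold is an arbitrary heuristic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Handoff:
    goal: str     # what the child must produce
    context: str  # everything the child needs, since it sees no parent history


def check_handoff(handoff, min_goal_words=5):
    """Return a list of problems; an empty list means the handoff looks usable."""
    problems = []
    if len(handoff.goal.split()) < min_goal_words:
        problems.append("goal is too short to be specific")
    if not handoff.context.strip():
        problems.append("context is empty; the child starts with no history")
    return problems


vague = Handoff(goal="look into it", context="")
specific = Handoff(
    goal="Summarize breaking changes between v1 and v2 of the billing API",
    context="Changelog text pasted here; audience is the support team.",
)
print(check_handoff(vague))
print(check_handoff(specific))
```

A word count cannot judge whether a goal is actually clear, so a check like this only catches the most obvious omissions.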
Subagent Isolation: What Children Inherit and What They Do Not
What a subagent inherits from its parent:
- The parent’s API key
- The provider configuration (model, endpoint)
- The credential pool (which enables key rotation on rate limits)
- The configured toolset
What a subagent does NOT inherit:
- The parent’s conversation history
- The parent’s in-context reasoning
- Any intermediate tool call results from the parent’s session
- Any memory of previous tasks from prior turns
This inheritance model means asynchronous subagents are stateless with respect to parent context — deliberately so. It also means credential management is seamless: you do not need to configure children independently, and key rotation across rate-limited providers works automatically.
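The inheritance rules above can be modeled as a constructor that copies configuration but never conversation state. This is an illustrative data model, not the AIAgent class: AgentSession, spawn_child, and the field names are hypothetical, and sharing the credential pool by reference is an assumption made for the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class AgentSession:
    api_key: str
    provider: dict         # model and endpoint
    credential_pool: list  # keys available for rotation
    toolset: list
    history: list = field(default_factory=list)  # conversation turns


def spawn_child(parent):
    """Pass on credentials and configuration; start with an empty conversation."""
    return AgentSession(
        api_key=parent.api_key,
        provider=dict(parent.provider),
        credential_pool=parent.credential_pool,  # shared by reference (assumption)
        toolset=list(parent.toolset),
        # history is deliberately omitted, so the child gets a fresh empty list
    )


parent = AgentSession(
    api_key="sk-example",
    provider={"model": "example-model", "endpoint": "https://example.invalid"},
    credential_pool=["sk-example", "sk-backup"],
    toolset=["web_search", "read_file"],
    history=["user: plan the migration", "assistant: here is a draft plan"],
)
child = spawn_child(parent)
print(child.history)  # the parent's turns are not carried over
```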
What Comes Next — ACP and Cross-Session Durability
The current implementation of asynchronous subagents has one significant limitation worth understanding: it is single-session only. Background agents run as in-process threads, which means they do not survive across conversation turns or if the session ends.
Nous Research is already working on the next step. GitHub Issue #4949 tracks the Agent Communication Protocol (ACP), which targets cross-turn and cross-session durability for delegated tasks. ACP would allow a subagent spawned in one session to continue working and be collected in a future session — making truly long-horizon delegation possible.
Until ACP ships, the right mental model for asynchronous subagents in Hermes is: background concurrency within a session, not persistent jobs across sessions. For tasks that fit within a single working session, the current implementation is highly capable. For tasks that need to survive overnight or across restarts, ACP will be the enabling feature.
Getting Started with Asynchronous Subagents in Hermes Agent
Enabling asynchronous subagents requires no configuration changes beyond updating Hermes Agent to the latest version.
For existing users, the update command is simply: hermes update
After updating, the async_delegation toolset is available alongside the existing delegate_task tool. Both synchronous and asynchronous delegation patterns remain supported, so you can use synchronous delegation for quick fan-out tasks and reserve async delegation for workflows where you want to remain productive while the work runs.
Recommended approach for your first async workflow:
- Identify a task in your current workflow that causes long blocking waits when delegated
- Spawn that task using delegate_task_async instead of delegate_task
- Note the returned task_id
- Use check_task to monitor progress on a schedule that makes sense for the task length
- Issue steer_task if early output suggests a course correction is needed
- Call collect_task when you are ready to use the result
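The steps above can be sketched as a small control loop. In practice the parent agent issues these as tool calls, so the code below only models the decision logic. FakeTools is a scripted stand-in, and the status and recent fields, the argument names, and the "forum" trigger are assumptions made for the example.

```python
import time


class FakeTools:
    """Scripted stand-in for the async tools; not Hermes code."""

    def __init__(self):
        self._polls = 0
        self.steered = []

    def delegate_task_async(self, goal, context):
        return "task-1"

    def check_task(self, task_id):
        self._polls += 1
        status = "done" if self._polls >= 3 else "running"
        recent = "reading forum posts" if self._polls == 1 else "reading official docs"
        return {"status": status, "recent": recent}

    def steer_task(self, task_id, message):
        self.steered.append(message)

    def collect_task(self, task_id):
        return "final summary"


def first_async_workflow(tools, poll_seconds=0.0):
    # Spawn and note the returned task id.
    task_id = tools.delegate_task_async(
        goal="survey the docs", context="official sources only"
    )
    while True:
        state = tools.check_task(task_id)  # non-blocking status poll
        if state["status"] != "running":
            break
        if "forum" in state["recent"]:  # early output suggests drift
            tools.steer_task(task_id, "Use official documentation only.")
        time.sleep(poll_seconds)  # poll on a schedule suited to the task
    return tools.collect_task(task_id)  # merge the result when ready


tools = FakeTools()
print(first_async_workflow(tools))
print(tools.steered)
```

For a real long-running task the polling interval would be seconds or minutes rather than zero.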
If you are routing subagents to cheaper models, confirm your config.yaml model routing is set up before spawning async tasks. The credential pool is inherited automatically, so API key configuration needs no changes.
Why This Update Represents a Genuine Architectural Shift
Most feature updates in the agentic AI space are incremental — new tools, new model integrations, new context window expansions. Asynchronous subagents are different because they change the fundamental execution model of multi-agent delegation.
Synchronous delegation is conceptually simple but practically limiting. It forces multi-agent workflows into a sequential, blocking structure that scales poorly with task complexity and duration. The longer the delegated tasks, the more painful the blocking becomes, and the less practically useful the delegation model is for real-world workloads.
Asynchronous subagents break that constraint. They enable concurrent, non-blocking, steerable delegation that scales to long-running tasks without degrading the user experience. The six-tool lifecycle API (delegate_task_async, check_task, steer_task, collect_task, cancel_task, list_tasks) gives developers the full control surface they need to build sophisticated orchestration patterns on top of this primitive.
For the broader agentic AI ecosystem, Hermes Agent’s implementation is a practical demonstration that async delegation does not require exotic infrastructure. Background in-process threads, careful context isolation, and a clean tool API are sufficient to deliver the feature at production quality. That simplicity may prove influential.
Frequently Asked Questions
What are asynchronous subagents in Hermes Agent?
Asynchronous subagents are background child agents spawned by a parent agent in Hermes. They run independently without blocking the parent’s chat, returning control immediately so the parent can continue working while the child task executes.
How do asynchronous subagents differ from the existing delegate_task tool?
The original delegate_task is synchronous: the parent freezes inside the tool call until all children complete. Asynchronous subagents via delegate_task_async return a task_id immediately and run in background threads, with a six-tool lifecycle API for monitoring, steering, and collecting results.
Do I need to reconfigure my API keys for asynchronous subagents?
No. Subagents inherit the parent’s API key and credential pool automatically, including key rotation behavior on rate limits.
Can asynchronous subagents run across sessions or restarts?
Not yet. The current implementation runs within a single session as in-process threads. Cross-session durability is planned through the Agent Communication Protocol (ACP), tracked in GitHub Issue #4949.
How do I enable asynchronous subagents in Hermes?
Run hermes update to pull the latest version. No additional configuration is required; the async_delegation toolset is available immediately after updating.