Sub-Agent Delegation

Sub-agents let a primary Penguin agent break large objectives into delegated tasks that can run with scoped permissions, tailored prompts, and isolated execution state.

Delegation Model

Primary agent receives a user instruction and determines that a supporting workflow is needed.
Sub-agent spawn happens via the core orchestration pipeline. Each sub-agent inherits the parent's system prompt, tools, and conversation metadata unless explicitly overridden.
Scoped execution ensures the sub-agent can only act within its delegated objective. Results are streamed back to the parent for evaluation.
Merge and respond: The parent agent inspects the sub-agent's output (and optional partial checkpoints) and incorporates it into the final reply.
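
The inherit-unless-overridden rule in the spawn step can be pictured with a small standalone sketch. It does not use Penguin's classes: `AgentProfile` and `spawn_profile` are hypothetical names chosen for illustration, and the real orchestration pipeline may track more state than shown here.

```python
from dataclasses import dataclass, field, replace
from typing import Optional


@dataclass(frozen=True)
class AgentProfile:
    """Hypothetical stand-in for the settings a sub-agent inherits at spawn."""

    agent_id: str
    system_prompt: str
    tools: tuple = ()
    metadata: dict = field(default_factory=dict)
    parent_id: Optional[str] = None


def spawn_profile(parent: AgentProfile, child_id: str, **overrides) -> AgentProfile:
    """Copy the parent's settings, then apply any explicit overrides."""
    inherited = replace(
        parent,
        agent_id=child_id,
        parent_id=parent.agent_id,
        metadata=dict(parent.metadata),  # copy so the child cannot mutate the parent's dict
    )
    return replace(inherited, **overrides)


primary = AgentProfile("primary", "You are a careful engineer.", ("shell", "web_search"))
auditor = spawn_profile(primary, "auditor", tools=("web_search",))
```

Here `auditor` keeps the parent's system prompt but runs with a narrower tool list, which mirrors the read-only audit use case.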

Use Cases

Running long-lived analysis in parallel with a main dialogue
Executing read-only audits before the primary agent performs mutating actions
Enlisting specialized prompts (security reviewer, documentation writer) without swapping personas for the entire session

Capabilities

Shared Context

Sub-agents have access to:

Conversation history provided by the parent at time of spawn
Registered tools (file editing, shell access, web search, etc.)
Memory recall, including vector search results and semantic summaries

Scoped State

Checkpoints: Sub-agents can create checkpoints tagged with their identifier. Parents can choose to adopt or discard them.
Tokens and budgets: Each sub-agent run maintains its own token accounting, allowing strict budgeting without impacting the parent run.
Streaming callbacks: Streaming output from sub-agents is surfaced through the same event bus so UIs can display incremental progress.
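
To make the per-run token accounting concrete, here is a minimal sketch of independent budgets keyed by agent identifier. `TokenBudget` and `BudgetExceeded` are illustrative names, not Penguin APIs, and the limits are arbitrary.

```python
class BudgetExceeded(RuntimeError):
    """Raised when a charge would push a run past its token limit."""


class TokenBudget:
    """Hypothetical per-agent token counter with a hard limit."""

    def __init__(self, limit: int) -> None:
        if limit <= 0:
            raise ValueError("limit must be positive")
        self.limit = limit
        self.used = 0

    @property
    def remaining(self) -> int:
        return self.limit - self.used

    def charge(self, tokens: int) -> None:
        """Record token usage, refusing any charge that exceeds the limit."""
        if tokens < 0:
            raise ValueError("tokens must be non-negative")
        if self.used + tokens > self.limit:
            raise BudgetExceeded(f"{tokens} tokens requested, {self.remaining} remaining")
        self.used += tokens


# Each agent gets its own counter, so sub-agent usage never touches the parent's.
budgets = {"primary": TokenBudget(8000), "research": TokenBudget(512)}
budgets["research"].charge(300)
```

Because the counters are separate objects, exhausting the `research` budget leaves the `primary` budget untouched.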

Working with Sub-Agents

Python API

import asyncio

from penguin.api_client import ChatOptions, PenguinClient


async def research_and_write(prompt: str) -> str:
    async with PenguinClient() as client:
        parent_id = "primary"
        researcher_id = "research"

        # Ensure a base conversation for the parent agent
        cm = client.core.conversation_manager
        cm.create_agent_conversation(parent_id)

        # Create a sub-agent whose shared context window is capped at 512 tokens
        cm.create_sub_agent(
            researcher_id,
            parent_agent_id=parent_id,
            shared_context_window_max_tokens=512,
        )

        # Let the researcher gather information
        research_notes = await client.chat(
            prompt,
            options=ChatOptions(agent_id=researcher_id),
        )

        # Feed the findings back to the primary agent for synthesis
        return await client.chat(
            f"Summarize and refine: {research_notes}",
            options=ChatOptions(agent_id=parent_id),
        )


asyncio.run(research_and_write("Compile highlights from the latest changelog."))

Under the hood the conversation manager clones the parent's system and context state, optionally clamping context-window budgets so the delegated run cannot exceed agreed limits. Advanced setups can combine this with PenguinCore.register_agent to wire dedicated executors once the engine is running.
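
The clone-and-clamp behavior described above might look roughly like the following. This is a sketch under stated assumptions, not the conversation manager's actual logic: `clone_with_clamp` is a hypothetical helper, tokens are estimated by whitespace splitting, and the policy of always keeping system messages and then the newest messages that fit is an illustrative choice.

```python
import copy


def estimate_tokens(text: str) -> int:
    """Crude whitespace estimate; a real tokenizer would give different counts."""
    return len(text.split())


def clone_with_clamp(parent_messages, parent_max_tokens, child_max_tokens=None):
    """Return (budget, cloned_messages) for a child, never exceeding the parent's budget."""
    if child_max_tokens is None:
        budget = parent_max_tokens
    else:
        budget = min(parent_max_tokens, child_max_tokens)

    # Always keep system messages, then add the newest messages that still fit.
    keep = {i for i, m in enumerate(parent_messages) if m["role"] == "system"}
    used = sum(estimate_tokens(parent_messages[i]["content"]) for i in keep)
    for i in range(len(parent_messages) - 1, -1, -1):
        if i in keep:
            continue
        cost = estimate_tokens(parent_messages[i]["content"])
        if used + cost > budget:
            break
        keep.add(i)
        used += cost

    # Deep-copy so the child's edits never leak back into the parent's history.
    return budget, [copy.deepcopy(parent_messages[i]) for i in sorted(keep)]
```

Note that the clamp only ever lowers the budget: asking for more than the parent has still yields the parent's limit.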

Need repeatable personas? Define them in config.yml under the agents: section with system prompts, default tools, and model overrides (including alternate providers such as OpenRouter). Pass persona="research" to PenguinCore.register_agent or client.create_sub_agent to pull those defaults in without re-specifying each field.

The CLI exposes these persona presets via penguin agent personas, and you can register or update agents with penguin agent spawn / penguin agent set-persona. The TUI mirrors the same affordances through /agent … commands so multi-agent rosters stay visible while you experiment.

ActionXML (Agents-as-Tools)

Sub-agents can also be managed directly from model output using ActionXML tags:

<spawn_sub_agent>{...}</spawn_sub_agent> – create a child (defaults to isolated session/CW). Supports id, parent, persona, system_prompt, share_session, share_context_window, shared_context_window_max_tokens, model_config_id or model_overrides, default_tools, and an optional initial_prompt.
<stop_sub_agent>{"id": "child"}</stop_sub_agent> – pause a child (engine-driven loops should skip work).
<resume_sub_agent>{"id": "child"}</resume_sub_agent> – resume a paused child.
<delegate>{"parent": "default", "child": "child", "content": "…", "channel": "dev-room"}</delegate> – send a message to a child and record a delegation event (includes channel).

See penguin/prompt_actions.py for the full syntax and examples.
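
For a feel of what these tags look like on the wire, the sketch below builds and extracts them with the standard library. It is not Penguin's parser: `action_tag` and `parse_actions` are hypothetical helpers, and the value types in the payloads (for example a boolean `share_session`) are assumptions. Only the tag and field names come from the list above.

```python
import json
import re

# Tag names taken from the list above; the regex itself is illustrative.
ACTION_RE = re.compile(
    r"<(spawn_sub_agent|stop_sub_agent|resume_sub_agent|delegate)>(.*?)</\1>",
    re.DOTALL,
)


def action_tag(name: str, payload: dict) -> str:
    """Wrap a JSON payload in an ActionXML-style tag."""
    return f"<{name}>{json.dumps(payload)}</{name}>"


def parse_actions(model_output: str) -> list:
    """Return (tag_name, payload_dict) pairs in the order they appear."""
    return [
        (match.group(1), json.loads(match.group(2)))
        for match in ACTION_RE.finditer(model_output)
    ]


model_output = (
    "I will delegate the audit. "
    + action_tag("spawn_sub_agent", {"id": "child", "parent": "default", "share_session": False})
    + " "
    + action_tag("delegate", {"parent": "default", "child": "child",
                              "content": "Audit the changelog", "channel": "dev-room"})
)
actions = parse_actions(model_output)
```

A simple regex like this breaks if the JSON content itself contains a closing tag, so treat it as a reading aid rather than a robust parser.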

CLI Quick Reference

List agents:
- penguin agent list (table)
- penguin agent list --json (script-friendly)
Spawn sub-agent:
- penguin agent spawn child --parent default --isolate-session --isolate-context [--persona research] [--model-id kimi-lite]
Pause/Resume:
- penguin agent pause child
- penguin agent resume child
Agent info:
- penguin agent info child --json
REST convenience:
- POST /api/v1/agents to spawn (parent optional)
- POST /api/v1/agents/{id}/delegate to route work with channel metadata
- PATCH /api/v1/agents/{id} with { "paused": true|false }
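
As an illustration of the pause/resume endpoint, the sketch below prepares (but does not send) the PATCH request with the standard library. The base URL and the `set_paused_request` helper are assumptions. Only the path and the `paused` body field come from the reference above.

```python
import json
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "http://localhost:8000"  # assumption: adjust to wherever your server listens


def set_paused_request(agent_id: str, paused: bool) -> Request:
    """Build a PATCH /api/v1/agents/{id} request that pauses or resumes an agent."""
    body = json.dumps({"paused": paused}).encode("utf-8")
    return Request(
        f"{BASE_URL}/api/v1/agents/{quote(agent_id, safe='')}",
        data=body,
        method="PATCH",
        headers={"Content-Type": "application/json"},
    )


pause_child = set_paused_request("child", True)
```

Pass the resulting object to `urllib.request.urlopen` against a running server to actually apply the change.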

Live Script Example

The repository includes scripts/phaseD_live_sub_agent_demo.py, a Python client demo that spawns two sub-agents, runs focused prompts through the engine, and prints conversation summaries. Run it with:

uv run python scripts/phaseD_live_sub_agent_demo.py

It respects the same model requirements documented above (defaulting to the OpenRouter Moonshot model). Use this as a template for richer experiments or integration tests.

Message Flow & Ordering

Shared transport: Parent and sub-agents use the same MessageBus fabric as top-level personas. Registering a sub-agent wires an inbox handler for that agent_id; core.send_to_agent(...) simply enqueues events for that handler.
Event-driven: Delegates operate asynchronously. Parents send work, then consume events (stream chunks, action results, summaries) as they arrive. There is no blocking RPC; instead, the conversation manager records every message with agent_id, recipient_id, and timestamps so parents can reconstruct the full timeline.
Ordering guarantees: Each agent processes its own queue sequentially—tool output emitted by a delegate arrives in-order for that delegate. When multiple agents talk on the same room/channel, interleaving is determined by send time; rely on the recorded timestamps and channel metadata to understand flow.
Result merging: Parents typically read the delegate’s conversation history or listen on the shared channel to decide how to respond. For deterministic handoffs, write shared artifacts (e.g., context/TASK_CHARTER.md) so every participant reads the same source of truth before continuing.
Synchronous needs: When a parent must wait for a specific completion signal, have the delegate post a sentinel message (e.g., status=ready) or update the shared charter/status file—parents can watch for that condition before proceeding.
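
The queue-per-agent ordering and the sentinel pattern can be tried out with a toy bus built on `asyncio.Queue`. This is an illustrative stand-in, not Penguin's MessageBus: `ToyBus`, `delegate_worker`, and the `stop` control message are hypothetical, and only the `status=ready` sentinel idea comes from the notes above.

```python
import asyncio
import time
from dataclasses import dataclass, field


@dataclass
class Message:
    agent_id: str
    recipient_id: str
    content: str
    timestamp: float = field(default_factory=time.monotonic)


class ToyBus:
    """One inbox queue per agent, plus a log for reconstructing the timeline."""

    def __init__(self) -> None:
        self.inboxes = {}
        self.log = []

    def register(self, agent_id: str) -> asyncio.Queue:
        return self.inboxes.setdefault(agent_id, asyncio.Queue())

    async def send(self, sender: str, recipient: str, content: str) -> None:
        message = Message(sender, recipient, content)
        self.log.append(message)
        await self.inboxes[recipient].put(message)


async def delegate_worker(bus: ToyBus, agent_id: str, parent_id: str) -> None:
    """Process the inbox sequentially, then post a sentinel when finished."""
    inbox = bus.register(agent_id)
    while True:
        message = await inbox.get()
        if message.content == "stop":
            break
        await bus.send(agent_id, parent_id, f"result:{message.content}")
    await bus.send(agent_id, parent_id, "status=ready")


async def run_demo() -> list:
    bus = ToyBus()
    parent_inbox = bus.register("primary")
    bus.register("research")
    worker = asyncio.create_task(delegate_worker(bus, "research", "primary"))

    # The parent enqueues work without blocking on each result.
    for task in ("scan", "summarize", "stop"):
        await bus.send("primary", "research", task)

    # The parent consumes events until the sentinel arrives.
    received = []
    while True:
        message = await parent_inbox.get()
        received.append(message.content)
        if message.content == "status=ready":
            break
    await worker
    return received
```

Running `asyncio.run(run_demo())` returns the delegate's results in the order the work was sent, followed by the sentinel, because a single worker drains its own queue sequentially.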

REST and WebSocket

Today, the REST and WebSocket interfaces expose the agent_id routing parameter. Sub-agent orchestration occurs through the core APIs shown above. API-level payloads for sub-agent creation are on the roadmap and are expected to follow the same intent; treat them as future work.

Best Practices

Keep scopes tight: Sub-agents should have a singular, well-defined objective. Broad scopes reduce determinism.
Budget tokens: Supply explicit limits when spawning analysis-heavy sub-agents to avoid runaway costs.
Audit results: Treat sub-agent output as suggestions; validate before enacting irreversible changes.
Instrument: Include sub-agent identifiers in your telemetry so you can monitor success rates and latency.

Roadmap

First-class CLI commands for configuring sub-agent templates
Adaptive delegation heuristics that decide when to spawn sub-agents automatically
Fine-grained permission profiles per sub-agent (read-only vs. write access)
Visualizations in the web UI showing delegation trees and progress

Need to coordinate multiple top-level personas instead? Check out Multi-Agent Orchestration.
