Some tasks are embarrassingly parallel: researching 5 competitors, checking 3 APIs, or reviewing code across multiple repositories. Running these sequentially wastes time. Foxl's multi-agent system runs them simultaneously.

How it works
When the agent determines subtasks can run independently, it spawns subagents using the sessions_spawn tool. Each subagent inherits the parent's full conversation context (for prompt cache sharing) and runs autonomously with its own tool access.
Up to 5 subagents run concurrently. Each subagent's progress is broadcast via WebSocket events, so you see real-time updates in the sidebar - including elapsed time and a stop button for each agent.
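The fan-out described above can be sketched as a capped worker pool that reports each result as it is collected, the way the server broadcasts progress over WebSocket. This is an illustrative sketch, not Foxl's actual code: the function and parameter names (`run_subagents`, `run_one`, `on_progress`) are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_SUBAGENTS = 5  # concurrency cap from the post; the constant name is illustrative

def run_subagents(tasks, run_one, on_progress):
    # Fan out independent subtasks across at most 5 concurrent workers.
    if len(tasks) > MAX_SUBAGENTS:
        raise ValueError(f"at most {MAX_SUBAGENTS} subagents may run concurrently")
    results = {}
    with ThreadPoolExecutor(max_workers=MAX_SUBAGENTS) as pool:
        futures = {pool.submit(run_one, t): t for t in tasks}
        for fut, task in futures.items():
            # Collect each result and emit a progress event, mirroring the
            # per-agent updates shown in the sidebar.
            results[task] = fut.result()
            on_progress(task, "done")
    return results
```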
The auto-continuation problem
The tricky part is what happens when subagents complete. The parent agent needs to synthesize their results into a coherent response. In earlier versions, this was handled by two competing in-memory queues: one in the subagent module, one in the server. Race conditions between them could trigger duplicate auto-continuations, causing the parent to spawn extra agents (3 requested, 6 created).
DB-backed solution (v0.2.0)
We replaced the in-memory queues with SQLite tables: auto_continue_queue stores pending results, and auto_continue_lock serializes drain execution with an atomic INSERT OR IGNORE mutex. When a subagent completes:
- Its result is written to the DB queue
- The server tries to acquire the drain lock
- If acquired, it drains all pending results and synthesizes them
- If the lock is already held, the result waits in the queue
- When the current drain finishes, it releases the lock and checks again
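The steps above can be sketched against a real SQLite database. The table names come from the post; the column layouts and helper names are assumptions for illustration.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE auto_continue_queue (id INTEGER PRIMARY KEY, session_id TEXT, result TEXT)")
db.execute("CREATE TABLE auto_continue_lock (name TEXT PRIMARY KEY)")

def enqueue_result(session_id, result):
    # Every completed subagent durably writes its result to the queue.
    db.execute("INSERT INTO auto_continue_queue (session_id, result) VALUES (?, ?)",
               (session_id, result))
    db.commit()

def try_acquire_lock():
    # INSERT OR IGNORE is atomic: exactly one caller inserts the row and wins;
    # for everyone else the insert is ignored and rowcount stays 0.
    cur = db.execute("INSERT OR IGNORE INTO auto_continue_lock (name) VALUES ('drain')")
    db.commit()
    return cur.rowcount == 1

def release_lock():
    db.execute("DELETE FROM auto_continue_lock WHERE name = 'drain'")
    db.commit()

def drain(synthesize):
    # Only the lock holder drains; losers leave their result queued for the
    # holder's next pass, so no result is lost and no drain runs twice.
    if not try_acquire_lock():
        return False
    try:
        while True:
            rows = db.execute(
                "SELECT id, session_id, result FROM auto_continue_queue ORDER BY id"
            ).fetchall()
            if not rows:
                break
            synthesize([r[2] for r in rows])
            db.execute("DELETE FROM auto_continue_queue WHERE id <= ?", (rows[-1][0],))
            db.commit()
    finally:
        release_lock()
    return True
```

Because both the queue and the lock live in the same database, a crash mid-drain leaves the pending results intact rather than losing them with the process, which an in-memory queue cannot guarantee.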
This eliminates race conditions entirely. The lock is also used as the spawn guard - while auto-continuation is active, sessions_spawn checks isAutoContinuing() and refuses to create new agents. The parent must synthesize what it has.
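The spawn-guard side of the lock might look like the following sketch. The check reuses the same lock row as the drain, so "a drain is active" and "spawning is forbidden" are the same fact; the function bodies here are assumptions, not Foxl's implementation.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE auto_continue_lock (name TEXT PRIMARY KEY)")

def is_auto_continuing():
    # The drain lock row doubles as the spawn guard: if it exists, a drain is active.
    row = db.execute("SELECT 1 FROM auto_continue_lock WHERE name = 'drain'").fetchone()
    return row is not None

def sessions_spawn(task):
    # Hypothetical guard mirroring the behavior described above: refuse to
    # create new agents while auto-continuation is in progress.
    if is_auto_continuing():
        raise RuntimeError("auto-continuation in progress; synthesize existing results first")
    return {"task": task, "status": "spawned"}
```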
Fork and cache sharing
Each subagent receives the parent's full message history as a prefix. This means the first ~90% of tokens in the prompt are byte-identical across all subagents, enabling prompt cache hits on the AI provider side. In practice, this cuts cost and latency for multi-agent tasks significantly.
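The shared-prefix property is easy to demonstrate: because the parent's history is prepended verbatim, any two subagent prompts serialize to byte-identical prefixes, which is exactly what provider-side prompt caching keys on. The helper name and message shapes below are illustrative assumptions.

```python
import json

def build_subagent_prompt(parent_messages, subtask):
    # Hypothetical helper: the parent's full history is prepended verbatim,
    # so every subagent prompt shares a byte-identical prefix with its siblings.
    return list(parent_messages) + [{"role": "user", "content": subtask}]

parent = [
    {"role": "user", "content": "Compare three vendors for us."},
    {"role": "assistant", "content": "Splitting this into one subtask per vendor."},
]
prompt_a = json.dumps(build_subagent_prompt(parent, "Research vendor A"))
prompt_b = json.dumps(build_subagent_prompt(parent, "Research vendor B"))
# The serialized parent history (minus its closing bracket) is the shared prefix.
shared_prefix = json.dumps(parent)[:-1]
```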
Read more in our subagents documentation.