Agent fleet consolidation
What this page is
The shape of the consolidated specialist pool after ADR 004. Covers the agent registry, the orchestrator's dispatch path, and the per-agent contract that downstream tools and the operator console rely on.
Why it exists this way
Before consolidation, the agent fleet was a mix of
hand-registered Python classes and shell-launched processes
that disagreed on lifecycle (some restarted on crash, some did
not), on logging shape (structured or unstructured), and on policy gating
(some checked the system mode, some did not). ADR 004 picked a
single in-process BeeAI Agent subclass per specialist and
moved orchestration into aurorasoc.workflows.investigation.
How it works
The registry lives at
packages/backend/aurorasoc/agents/factory.py.
AgentFactory is the only API the rest of the backend uses
to construct an agent; it reads the registry, resolves the
inference backend per ADR 005, wires
the per-agent tool list, and returns a configured BeeAI
Agent ready to serve task dispatches.
Nine specialists ship in this configuration:
- orchestrator, dispatch router; does not call tools itself.
- threat-hunter, Sigma and IOC pivots; the canary path in the LLM evaluation harness wires through this agent first.
- malware-analyst, sandbox dispatch, YARA scanning.
- incident-responder, playbook execution under the approval gate.
- network-security, Suricata/Zeek triage and IDS sig evaluation.
- cps-security, OT attestation and Modbus/MQTT triage.
- threat-intel, IOC enrichment and feed reconciliation.
- endpoint-security, EDR action dispatch and live triage.
- forensic-analyst, evidence collection and timeline reconstruction.
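The registry-plus-factory shape described above can be sketched roughly as follows. This is a minimal illustration, not the contents of factory.py: AgentSpec, ConfiguredAgent, the create method, the resolve_backend callback, and the tool identifiers are all hypothetical stand-ins, and the real factory returns a BeeAI Agent rather than a plain dataclass. Only the agent names and the AgentFactory class name come from this page.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class AgentSpec:
    """Hypothetical registry entry: one per specialist."""
    name: str
    role: str
    tools: tuple[str, ...] = ()  # placeholder tool identifiers


# Names and roles follow the list above; tool identifiers are illustrative only.
REGISTRY = {
    spec.name: spec
    for spec in (
        AgentSpec("orchestrator", "dispatch router"),  # calls no tools itself
        AgentSpec("threat-hunter", "Sigma and IOC pivots", ("sigma", "ioc-pivot")),
        AgentSpec("malware-analyst", "sandbox dispatch, YARA scanning", ("sandbox", "yara")),
        AgentSpec("incident-responder", "playbook execution", ("playbook",)),
        AgentSpec("network-security", "Suricata/Zeek triage", ("suricata", "zeek")),
        AgentSpec("cps-security", "OT attestation, Modbus/MQTT triage", ("attestation",)),
        AgentSpec("threat-intel", "IOC enrichment, feed reconciliation", ("enrichment",)),
        AgentSpec("endpoint-security", "EDR action dispatch", ("edr",)),
        AgentSpec("forensic-analyst", "evidence collection", ("evidence",)),
    )
}


@dataclass
class ConfiguredAgent:
    """Stand-in for the configured BeeAI Agent the real factory returns."""
    spec: AgentSpec
    backend: str


class AgentFactory:
    """Single construction path: look up the spec, resolve a backend, wire tools."""

    def __init__(
        self,
        registry: dict[str, AgentSpec],
        resolve_backend: Callable[[str], str],
    ) -> None:
        self._registry = registry
        self._resolve_backend = resolve_backend  # assumed hook for ADR 005 resolution

    def create(self, name: str) -> ConfiguredAgent:
        try:
            spec = self._registry[name]
        except KeyError:
            raise ValueError(f"unknown agent: {name!r}") from None
        return ConfiguredAgent(spec=spec, backend=self._resolve_backend(name))


factory = AgentFactory(REGISTRY, resolve_backend=lambda name: "vllm")
agent = factory.create("threat-hunter")
```

Keeping construction behind one method is what lets lifecycle, logging, and policy checks be applied uniformly; callers never instantiate a specialist directly.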
Every agent's startup is checkpointed by
aurorasoc.repositories.investigation_repository so a worker
restart does not lose the in-flight investigation.
What goes wrong
- A specialist's required tools are unavailable (the underlying
MCP server is down). The factory returns the agent in a
degraded state with the typed
AgentInitError::MissingTools so the orchestrator can skip the
agent for the current dispatch rather than block.
- The inference backends disagree (vLLM is healthy, but the
Ollama fallback returns an empty completion). The handling is
owned by Inference backend resolution; agents themselves are
agnostic and surface the typed error upstream.
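The first failure mode above, where a degraded agent is skipped rather than blocking the dispatch, might look like the sketch below. It is an assumption-laden illustration: AgentHandle, build_agent, and dispatchable are hypothetical names, and the Python enum member MISSING_TOOLS merely stands in for the typed AgentInitError::MissingTools named on this page.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum, auto
from typing import Iterable, Optional


class AgentInitError(Enum):
    # Stand-in for the typed AgentInitError::MissingTools error.
    MISSING_TOOLS = auto()


@dataclass
class AgentHandle:
    """Hypothetical wrapper: an agent plus the reason it is degraded, if any."""
    name: str
    init_error: Optional[AgentInitError] = None

    @property
    def degraded(self) -> bool:
        return self.init_error is not None


def build_agent(
    name: str, required_tools: Iterable[str], available_tools: Iterable[str]
) -> AgentHandle:
    """Return a handle even when tools are missing, marked as degraded."""
    missing = set(required_tools) - set(available_tools)
    if missing:
        return AgentHandle(name, AgentInitError.MISSING_TOOLS)
    return AgentHandle(name)


def dispatchable(handles: Iterable[AgentHandle]) -> list[str]:
    """Names the orchestrator may dispatch to in this round; degraded agents are skipped."""
    return [h.name for h in handles if not h.degraded]


available = {"enrichment"}  # e.g. the MCP server behind "yara" is down
handles = [
    build_agent("malware-analyst", ["sandbox", "yara"], available),
    build_agent("threat-intel", ["enrichment"], available),
]
ready = dispatchable(handles)
```

Returning a marked handle instead of raising keeps the decision with the orchestrator, which can skip the agent for the current dispatch and retry on a later one.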