LLM Integration Architecture

This page explains how IBM Granite 4 models are wired into AuroraSOC's multi-agent system — from the factory that creates agents, through the Granite module that resolves models, to the serving backends that run inference.

High-Level Architecture

Component Responsibilities

| Component          | File                          | Purpose                                                                                           |
|--------------------|-------------------------------|---------------------------------------------------------------------------------------------------|
| AuroraAgentFactory | aurorasoc/agents/factory.py   | Creates all 17 BeeAI agents (orchestrator + 16 specialists) with correct LLMs, tools, and prompts |
| Granite Module     | aurorasoc/granite/__init__.py | Model resolution, configuration, and ChatModel creation                                           |
| Model Registry     | aurorasoc/granite/registry.py | Health checks, model availability, warmup                                                         |
| Agent Prompts      | aurorasoc/agents/prompts.py   | System prompts for each agent persona                                                             |
| BeeAI Framework    | beeai-framework               | RequirementAgent, ChatModel, tools, middleware                                                    |

The Request Flow

When an alert arrives in AuroraSOC, here's the complete flow from event to LLM response:

Step-by-step:

  1. Event ingestion: An alert arrives via MQTT or NATS JetStream
  2. Normalization: The event pipeline normalizes the alert into a standard format
  3. Dispatch: The API layer calls the factory to handle the alert
  4. Agent selection: The factory determines which specialist agent should handle this alert type
  5. Model resolution: _llm_for(agent_name) calls the Granite module to resolve the correct model
  6. ChatModel creation: The Granite module returns a BeeAI ChatModel instance configured for the correct backend
  7. Agent creation: The factory creates a RequirementAgent with the model, tools, and system prompt
  8. Inference: The agent uses the ChatModel to query Ollama or vLLM
  9. Response: The agent returns structured analysis back to the API layer
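
A condensed sketch of steps 3 through 9. The names handle_alert, route_for, and ALERT_ROUTES are invented for illustration; only AuroraAgentFactory and its create_agent() come from the code described below:

# Illustrative dispatch only; the real routing lives in the API layer.
ALERT_ROUTES = {"malware": "malware_analyst", "lateral_movement": "threat_hunter"}

def route_for(alert_type: str) -> str:
    # Step 4: agent selection, with a default specialist as fallback.
    return ALERT_ROUTES.get(alert_type, "threat_hunter")

async def handle_alert(factory, alert: dict) -> str:
    agent = await factory.create_agent(route_for(alert["type"]))  # steps 5-7
    response = await agent.run(alert["normalized"])               # step 8: inference
    return response.answer.text                                   # step 9: analysis back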

The Factory: AuroraAgentFactory

The factory (aurorasoc/agents/factory.py) is the central point where agents get their LLMs. Specialist definitions are declared in AGENT_SPECS, then instantiated through a generic create_agent() path.
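
For orientation, AGENT_SPECS plausibly maps agent names to declarative definitions. The field names below are assumptions, not the actual schema:

# Illustrative shape only; the real AGENT_SPECS schema may differ.
AGENT_SPECS: dict[str, dict] = {
    "threat_hunter":   {"prompt": "threat_hunter",   "tools": ["think", "search"]},
    "malware_analyst": {"prompt": "malware_analyst", "tools": ["think", "sandbox"]},
    # ... 14 more specialist entries
}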

Factory Initialization

class AuroraAgentFactory:
    def __init__(self, model_name: str | None = None,
                 granite_config: GraniteModelConfig | None = None):
        self.granite_config = granite_config or get_default_granite_config()
        if model_name:
            self.granite_config.model_name = model_name

The factory accepts an optional model_name (for manual override) and an optional GraniteModelConfig. If neither is provided, it falls back to get_default_granite_config() which reads from environment variables.
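
A minimal sketch of that fallback, assuming GraniteModelConfig is a simple dataclass; the field defaults and environment variable names are assumptions, not the actual implementation:

import os
from dataclasses import dataclass

@dataclass
class GraniteModelConfig:
    model_name: str = "granite4:small"   # assumed default model tag
    backend: str = "ollama"              # assumed: "ollama" or "vllm"

def get_default_granite_config() -> GraniteModelConfig:
    # Env var names here are illustrative assumptions.
    return GraniteModelConfig(
        model_name=os.getenv("GRANITE_MODEL_NAME", "granite4:small"),
        backend=os.getenv("GRANITE_BACKEND", "ollama"),
    )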

The _llm_for() Method

This is the critical bridge between agents and models:

def _llm_for(self, agent_name: str) -> ChatModel:
    """Resolve the appropriate ChatModel for a given agent."""
    return create_granite_chat_model(
        config=self.granite_config,
        agent_name=agent_name,
    )
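
Inside the Granite module, create_granite_chat_model plausibly reduces to BeeAI's provider-agnostic ChatModel.from_name. A sketch under that assumption (the per-agent mapping and the provider logic are illustrative):

from beeai_framework.backend.chat import ChatModel

# Hypothetical per-agent mapping; the real resolution logic lives in
# aurorasoc/granite/__init__.py and may differ.
AGENT_MODELS: dict[str, str] = {}

def create_granite_chat_model(config: GraniteModelConfig, agent_name: str) -> ChatModel:
    model_id = AGENT_MODELS.get(agent_name, config.model_name)
    # vLLM is assumed to be reached through its OpenAI-compatible endpoint.
    provider = "ollama" if config.backend == "ollama" else "openai"
    return ChatModel.from_name(f"{provider}:{model_id}")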

Every create_*() method calls _llm_for() to get its model:

def create_threat_hunter(self) -> RequirementAgent:
    llm = self._llm_for("threat_hunter")
    return RequirementAgent(
        llm=llm,
        tools=[ThinkTool(), ...],
        # ... system prompt, middleware
    )

The Orchestrator

The orchestrator is a special agent that routes alerts to the correct specialist via HandoffTool:

def create_orchestrator(self, agents: dict) -> RequirementAgent:
    llm = self._llm_for("orchestrator")
    handoff_tools = [
        HandoffTool(agent=agent, name=name)
        for name, agent in agents.items()
    ]
    return RequirementAgent(
        llm=llm,
        tools=[ThinkTool(), *handoff_tools],
        # ...
    )

The orchestrator gets its own LLM (potentially a fine-tuned orchestration model) and can delegate to any specialist agent.

Building the Full Team

async def build_full_team(self):
    # create_orchestrator expects a name -> agent mapping, so build a dict.
    specialists = {
        name: await self.create_agent(name)
        for name in AGENT_SPECS
    }
    orchestrator = self.create_orchestrator(specialists)
    return orchestrator, specialists

This builds all 16 specialist agents from AGENT_SPECS, keyed by name so create_orchestrator() can wrap each one in a HandoffTool, and returns the orchestrator together with the specialists.
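
Typical usage might look like this (a sketch; the entry point and alert text are invented):

import asyncio

async def main() -> None:
    factory = AuroraAgentFactory()
    orchestrator, specialists = await factory.build_full_team()
    # The orchestrator hands off to the right specialist via HandoffTool.
    response = await orchestrator.run("Triage: suspicious PowerShell spawned by winword.exe")
    print(response.answer.text)

asyncio.run(main())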

Data Flow Diagram

Key Design Decisions

Why BeeAI Framework?

AuroraSOC uses IBM's BeeAI Agent Framework because it provides:

  • RequirementAgent — agents that can specify requirements before responding (e.g., "I need the PCAP file")
  • ConditionalRequirement — attach requirements conditionally (e.g., only request ThinkTool for complex analysis)
  • GlobalTrajectoryMiddleware — observability into each agent's step-by-step trajectory (reasoning and tool calls) for debugging and audit
  • HandoffTool — zero-code agent-to-agent delegation
  • ChatModel.from_name() — provider-agnostic model resolution (supports Ollama, vLLM, OpenAI-compatible APIs)
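
That last point is what makes backend switching trivial. The model tags below are illustrative:

from beeai_framework.backend.chat import ChatModel

llm = ChatModel.from_name("ollama:granite4:small")
# vLLM is typically served behind an OpenAI-compatible API (assumption):
# llm = ChatModel.from_name("openai:granite-4-h-small")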

Why Per-Agent Models?

Not all security domains need the same model:

  • Malware analysis needs deep code/binary understanding → specialist model
  • Compliance needs regulatory knowledge → specialist model
  • Report generation needs structured output → different prompt engineering

Per-agent models let each specialist agent use a model fine-tuned for its exact domain.
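
Concretely, the hypothetical AGENT_MODELS table from the factory sketch above could carry per-domain overrides (model names invented for illustration):

# Agents not listed fall back to the factory-wide default model.
AGENT_MODELS = {
    "malware_analyst":    "granite4-malware-tuned",
    "compliance_officer": "granite4-compliance-tuned",
    "report_writer":      "granite4-structured-output",
}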

Why Granite 4 Hybrid?

IBM Granite 4 Hybrid uses a Mamba-Transformer architecture:

  • Mamba layers scale linearly, O(n), with sequence length, versus O(n²) for self-attention (better for long log analysis)
  • Transformer layers provide strong attention for reasoning tasks
  • Hybrid combines both strengths at lower VRAM than a pure Transformer

Next Steps