
LLM Integration Architecture

This page explains how IBM Granite 4 models are wired into AuroraSOC's multi-agent system — from the factory that creates agents, through the Granite module that resolves models, to the serving backends that run inference.

High-Level Architecture

Component Responsibilities

| Component | File | Purpose |
| --- | --- | --- |
| AuroraAgentFactory | aurorasoc/agents/factory.py | Creates all 16 BeeAI agents with correct LLMs, tools, and system prompts |
| Granite Module | aurorasoc/granite/__init__.py | Model resolution, configuration, and ChatModel creation |
| Model Registry | aurorasoc/granite/registry.py | Health checks, model availability, warmup |
| Agent Prompts | aurorasoc/agents/prompts.py | System prompts for each agent persona |
| BeeAI Framework | beeai-framework | RequirementAgent, ChatModel, tools, middleware |

The Request Flow

When an alert arrives in AuroraSOC, here's the complete flow from event to LLM response:

Step-by-step:

  1. Event ingestion: An alert arrives via MQTT or NATS JetStream
  2. Normalization: The event pipeline normalizes the alert into a standard format
  3. Dispatch: The API layer calls the factory to handle the alert
  4. Agent selection: The factory determines which specialist agent should handle this alert type
  5. Model resolution: _llm_for(agent_name) calls the Granite module to resolve the correct model
  6. ChatModel creation: The Granite module returns a BeeAI ChatModel instance configured for the correct backend
  7. Agent creation: The factory creates a RequirementAgent with the model, tools, and system prompt
  8. Inference: The agent uses the ChatModel to query Ollama or vLLM
  9. Response: The agent returns structured analysis back to the API layer
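Steps 3-5 can be sketched as a simple routing lookup. The alert types and agent names below are illustrative placeholders, not AuroraSOC's actual routing table:

```python
# Hypothetical sketch of dispatch: map a normalized alert's type to the
# specialist agent that should handle it. Keys here are invented examples.
ALERT_ROUTES = {
    "malware_detected": "malware_analyst",
    "lateral_movement": "threat_hunter",
    "policy_violation": "compliance_analyst",
}

def select_agent(alert: dict) -> str:
    """Pick the specialist for an alert type, defaulting to a generalist."""
    return ALERT_ROUTES.get(alert.get("type"), "security_analyst")

print(select_agent({"type": "lateral_movement"}))  # threat_hunter
```

The resolved agent name is what the factory later passes to _llm_for() in step 5.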

The Factory: AuroraAgentFactory

The factory (aurorasoc/agents/factory.py) is the central point where agents get their LLMs. It has 15 create_*() methods (one per specialist) plus create_orchestrator() and build_full_team().

Factory Initialization

class AuroraAgentFactory:
    def __init__(self, model_name: str | None = None,
                 granite_config: GraniteModelConfig | None = None):
        self.granite_config = granite_config or get_default_granite_config()
        if model_name:
            self.granite_config.model_name = model_name

The factory accepts an optional model_name (for manual override) and an optional GraniteModelConfig. If neither is provided, it falls back to get_default_granite_config() which reads from environment variables.
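The environment-variable fallback could look like the following sketch. The dataclass fields and variable names (GRANITE_MODEL, GRANITE_BACKEND) are assumptions for illustration, not the actual config schema:

```python
import os
from dataclasses import dataclass

# Hypothetical shape of GraniteModelConfig and its env-based default.
# Field names, env-var names, and defaults are invented placeholders.
@dataclass
class GraniteModelConfig:
    model_name: str
    backend: str  # e.g. "ollama" or "vllm"

def get_default_granite_config() -> GraniteModelConfig:
    """Read model selection from the environment, with safe defaults."""
    return GraniteModelConfig(
        model_name=os.environ.get("GRANITE_MODEL", "granite4:small"),
        backend=os.environ.get("GRANITE_BACKEND", "ollama"),
    )
```

Passing model_name to the constructor overrides only the model, while a full GraniteModelConfig overrides everything, including the backend.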

The _llm_for() Method

This is the critical bridge between agents and models:

def _llm_for(self, agent_name: str) -> ChatModel:
    """Resolve the appropriate ChatModel for a given agent."""
    return create_granite_chat_model(
        config=self.granite_config,
        agent_name=agent_name,
    )

Every create_*() method calls _llm_for() to get its model:

def create_threat_hunter(self) -> RequirementAgent:
    llm = self._llm_for("threat_hunter")
    return RequirementAgent(
        llm=llm,
        tools=[ThinkTool(), ...],
        # ... system prompt, middleware
    )

The Orchestrator

The orchestrator is a special agent that routes alerts to the correct specialist via HandoffTool:

def create_orchestrator(self, agents: dict) -> RequirementAgent:
    llm = self._llm_for("orchestrator")
    handoff_tools = [
        HandoffTool(agent=agent, name=name)
        for name, agent in agents.items()
    ]
    return RequirementAgent(
        llm=llm,
        tools=[ThinkTool(), *handoff_tools],
        # ...
    )

The orchestrator gets its own LLM (potentially a fine-tuned orchestration model) and can delegate to any specialist agent.

Building the Full Team

def build_full_team(self) -> RequirementAgent:
    agents = {
        "security_analyst": self.create_security_analyst(),
        "threat_hunter": self.create_threat_hunter(),
        "malware_analyst": self.create_malware_analyst(),
        # ... 12 more agents
    }
    return self.create_orchestrator(agents)

This creates all 15 specialist agents (each with its own resolved LLM) and wires them into the orchestrator.

Data Flow Diagram

Key Design Decisions

Why BeeAI Framework?

AuroraSOC uses IBM's BeeAI Agent Framework because it provides:

  • RequirementAgent — agents whose tool use is governed by declarative requirements rather than hard-coded control flow
  • ConditionalRequirement — attach requirements conditionally (e.g., force a ThinkTool step only for complex analysis)
  • GlobalTrajectoryMiddleware — observability into each agent's intermediate steps and tool calls
  • HandoffTool — zero-code agent-to-agent delegation
  • ChatModel.from_name() — provider-agnostic model resolution (supports Ollama, vLLM, OpenAI-compatible APIs)
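The "provider:model" naming scheme behind ChatModel.from_name() is what makes resolution provider-agnostic: the prefix selects the backend adapter and the remainder is the model tag. The split below is an illustrative sketch, not the framework's own parsing code:

```python
# Sketch of the "provider:model" convention used by ChatModel.from_name().
# Only the first colon separates provider from model, since model tags
# (e.g. Ollama's "granite4:small") may themselves contain colons.
def parse_model_name(name: str) -> tuple[str, str]:
    provider, _, model = name.partition(":")
    return provider, model

print(parse_model_name("ollama:granite4:small"))  # ('ollama', 'granite4:small')
```

Swapping a local Ollama model for a vLLM deployment then becomes a one-string configuration change rather than a code change.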

Why Per-Agent Models?

Not all security domains need the same model:

  • Malware analysis needs deep code/binary understanding → specialist model
  • Compliance needs regulatory knowledge → specialist model
  • Report generation needs structured output → different prompt engineering

Per-agent models let each specialist agent use a model fine-tuned for its exact domain.
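In configuration terms, this amounts to a per-agent override table consulted during model resolution. The table below is a hypothetical example; the model tags are invented placeholders, not real Granite releases:

```python
# Hypothetical per-agent model table illustrating the design above.
# Agents without an entry fall back to the shared default model.
PER_AGENT_MODELS = {
    "malware_analyst": "granite4-code",       # code/binary understanding
    "compliance_analyst": "granite4-policy",  # regulatory knowledge
    "report_writer": "granite4:small",        # structured output via prompting
}

def model_for(agent_name: str, default: str = "granite4:small") -> str:
    """Resolve the model tag for an agent, falling back to the default."""
    return PER_AGENT_MODELS.get(agent_name, default)
```

This keeps fine-tuned specialist models opt-in: adding one is a single table entry, and removing it silently reverts the agent to the default.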

Why Granite 4 Hybrid?

IBM Granite 4 Hybrid uses a Mamba-Transformer architecture:

  • Mamba layers provide O(n) complexity for long sequences (better for log analysis)
  • Transformer layers provide strong attention for reasoning tasks
  • Hybrid combines both strengths at lower VRAM than a pure Transformer
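A back-of-envelope comparison shows why linear-time layers matter for long log contexts. This ignores constants and is not a benchmark, just the asymptotic ratio:

```python
# Relative cost of quadratic attention (n^2) vs a linear-time Mamba-style
# layer (n) as context length grows. The ratio n^2 / n = n: at 100k
# tokens, attention does on the order of 100,000x more pairwise work.
for n in (1_000, 10_000, 100_000):
    print(f"n={n:>7}: attention/linear cost ratio = {n**2 // n:,}x")
```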

Next Steps