LLM Integration Architecture
This page explains how IBM Granite 4 models are wired into AuroraSOC's multi-agent system — from the factory that creates agents, through the Granite module that resolves models, to the serving backends that run inference.
High-Level Architecture
Component Responsibilities
| Component | File | Purpose |
|---|---|---|
| AuroraAgentFactory | aurorasoc/agents/factory.py | Creates all 16 BeeAI agents with correct LLMs, tools and system prompts |
| Granite Module | aurorasoc/granite/__init__.py | Model resolution, configuration, and ChatModel creation |
| Model Registry | aurorasoc/granite/registry.py | Health checks, model availability, warmup |
| Agent Prompts | aurorasoc/agents/prompts.py | System prompts for each agent persona |
| BeeAI Framework | beeai-framework | RequirementAgent, ChatModel, tools, middleware |
The Request Flow
When an alert arrives in AuroraSOC, here's the complete flow from event to LLM response:
Step-by-step:
- Event ingestion: An alert arrives via MQTT or NATS JetStream
- Normalization: The event pipeline normalizes the alert into a standard format
- Dispatch: The API layer calls the factory to handle the alert
- Agent selection: The factory determines which specialist agent should handle this alert type
- Model resolution: `_llm_for(agent_name)` calls the Granite module to resolve the correct model
- ChatModel creation: The Granite module returns a BeeAI `ChatModel` instance configured for the correct backend
- Agent creation: The factory creates a `RequirementAgent` with the model, tools, and system prompt
- Inference: The agent uses the ChatModel to query Ollama or vLLM
- Response: The agent returns structured analysis back to the API layer
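The agent-selection step can be sketched as a simple routing table. The alert types and mapping below are illustrative only, not AuroraSOC's actual routes:

```python
# Hypothetical alert-type -> specialist routing table (illustrative only).
ALERT_ROUTES = {
    "malware_detected": "malware_analyst",
    "lateral_movement": "threat_hunter",
}

def select_agent(alert: dict) -> str:
    """Pick the specialist for this alert type, defaulting to the generalist analyst."""
    return ALERT_ROUTES.get(alert.get("type"), "security_analyst")
```

A real dispatcher would likely also weigh severity and asset context before routing, but the fallback-to-generalist shape is the core idea.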
The Factory: AuroraAgentFactory
The factory (aurorasoc/agents/factory.py) is the central point where agents get their LLMs. It has 15 create_*() methods (one per specialist) plus create_orchestrator() and build_full_team().
Factory Initialization
```python
class AuroraAgentFactory:
    def __init__(self, model_name: str | None = None,
                 granite_config: GraniteModelConfig | None = None):
        self.granite_config = granite_config or get_default_granite_config()
        if model_name:
            self.granite_config.model_name = model_name
```
The factory accepts an optional model_name (for manual override) and an optional GraniteModelConfig. If neither is provided, it falls back to get_default_granite_config(), which reads from environment variables.
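A minimal sketch of that fallback, assuming a simplified GraniteModelConfig and illustrative environment-variable names (the real field and variable names live in aurorasoc/granite):

```python
import os
from dataclasses import dataclass

@dataclass
class GraniteModelConfig:
    # Simplified stand-in for the real config class.
    model_name: str
    backend: str

def get_default_granite_config() -> GraniteModelConfig:
    # AURORA_GRANITE_MODEL / AURORA_GRANITE_BACKEND are assumed names.
    return GraniteModelConfig(
        model_name=os.environ.get("AURORA_GRANITE_MODEL", "granite4:small"),
        backend=os.environ.get("AURORA_GRANITE_BACKEND", "ollama"),
    )
```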
The _llm_for() Method
This is the critical bridge between agents and models:
```python
def _llm_for(self, agent_name: str) -> ChatModel:
    """Resolve the appropriate ChatModel for a given agent."""
    return create_granite_chat_model(
        config=self.granite_config,
        agent_name=agent_name,
    )
```
Every create_*() method calls _llm_for() to get its model:
```python
def create_threat_hunter(self) -> RequirementAgent:
    llm = self._llm_for("threat_hunter")
    return RequirementAgent(
        llm=llm,
        tools=[ThinkTool(), ...],
        # ... system prompt, middleware
    )
```
The Orchestrator
The orchestrator is a special agent that routes alerts to the correct specialist via HandoffTool:
```python
def create_orchestrator(self, agents: dict) -> RequirementAgent:
    llm = self._llm_for("orchestrator")
    handoff_tools = [
        HandoffTool(agent=agent, name=name)
        for name, agent in agents.items()
    ]
    return RequirementAgent(
        llm=llm,
        tools=[ThinkTool(), *handoff_tools],
        # ...
    )
```
The orchestrator gets its own LLM (potentially a fine-tuned orchestration model) and can delegate to any specialist agent.
Building the Full Team
```python
def build_full_team(self) -> RequirementAgent:
    agents = {
        "security_analyst": self.create_security_analyst(),
        "threat_hunter": self.create_threat_hunter(),
        "malware_analyst": self.create_malware_analyst(),
        # ... 12 more agents
    }
    return self.create_orchestrator(agents)
```
This creates all 15 specialist agents (each with their own resolved LLM) and wires them into the orchestrator.
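The wiring pattern is worth seeing in isolation: one HandoffTool per specialist, keyed by name. The classes below are toy stand-ins for the real beeai-framework types, just to show the shape:

```python
# Toy stand-ins for RequirementAgent and HandoffTool; the real classes
# come from beeai-framework and carry LLMs, tools, and prompts.
class Agent:
    def __init__(self, name: str):
        self.name = name

class HandoffTool:
    def __init__(self, agent: Agent, name: str):
        self.agent, self.name = agent, name

def build_full_team() -> list[HandoffTool]:
    # One specialist per name; the orchestrator gets one handoff tool each.
    specialists = {n: Agent(n) for n in ("security_analyst", "threat_hunter")}
    return [HandoffTool(agent=a, name=n) for n, a in specialists.items()]
```

Because delegation is just "call the tool named after the agent," adding a sixteenth specialist means adding one dict entry, with no routing code changes.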
Data Flow Diagram
Key Design Decisions
Why BeeAI Framework?
AuroraSOC uses IBM's BeeAI Agent Framework because it provides:
- RequirementAgent — agents that can specify requirements before responding (e.g., "I need the PCAP file")
- ConditionalRequirement — attach requirements conditionally (e.g., only request ThinkTool for complex analysis)
- GlobalTrajectoryMiddleware — observability into each agent's reasoning and tool-call trajectory
- HandoffTool — zero-code agent-to-agent delegation
- ChatModel.from_name() — provider-agnostic model resolution (supports Ollama, vLLM, OpenAI-compatible APIs)
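ChatModel.from_name() takes a provider-prefixed identifier of the shape "provider:model", e.g. "ollama:granite4:small" (the exact model tag here is an assumption). A sketch of how such a name splits into backend and model:

```python
def parse_model_name(name: str) -> tuple[str, str]:
    # Only the first colon separates provider from model, so model tags
    # may themselves contain colons: "ollama:granite4:small".
    provider, _, model = name.partition(":")
    return provider, model
```

This is why swapping Ollama for vLLM (or any OpenAI-compatible endpoint) is a configuration change rather than a code change.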
Why Per-Agent Models?
Not all security domains need the same model:
- Malware analysis needs deep code/binary understanding → specialist model
- Compliance needs regulatory knowledge → specialist model
- Report generation needs structured output → different prompt engineering
Per-agent models let each specialist agent use a model fine-tuned for its exact domain.
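The override-with-fallback logic this implies is small. The fine-tune tags below are hypothetical, for illustration only:

```python
DEFAULT_MODEL = "granite4:small"

# Hypothetical per-agent fine-tune tags (illustrative only).
AGENT_MODELS = {
    "malware_analyst": "granite4:small-malware-ft",
    "report_writer": "granite4:small-reports-ft",
}

def model_for(agent_name: str) -> str:
    """Per-agent override, falling back to the shared default model."""
    return AGENT_MODELS.get(agent_name, DEFAULT_MODEL)
```

Agents without a dedicated fine-tune simply share the base model, so specialization is opt-in per agent.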
Why Granite 4 Hybrid?
IBM Granite 4 Hybrid uses a Mamba-Transformer architecture:
- Mamba layers provide O(n) complexity for long sequences (better for log analysis)
- Transformer layers provide strong attention for reasoning tasks
- Hybrid combines both strengths at lower VRAM than a pure Transformer
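A back-of-the-envelope comparison makes the scaling difference concrete (constants omitted; this only illustrates the O(n²) versus O(n) shapes, not real FLOP counts):

```python
def attention_ops(n: int) -> int:
    # Self-attention compares every token pair: ~n^2 interactions.
    return n * n

def mamba_ops(n: int) -> int:
    # A state-space scan touches each token once: ~n updates.
    return n

for n in (1_000, 100_000):
    print(f"seq={n}: attention/mamba ratio = {attention_ops(n) // mamba_ops(n)}")
```

The ratio grows linearly with sequence length, which is why the Mamba layers pay off most on long inputs like multi-megabyte log excerpts.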
Next Steps
- Granite Module Deep Dive — full details on model resolution
- Model Swap Guide — switch between base and fine-tuned models
- Serving Backends — Ollama vs vLLM configuration