LLM Integration Architecture
This page explains how IBM Granite 4 models are wired into AuroraSOC's multi-agent system — from the factory that creates agents, through the Granite module that resolves models, to the serving backends that run inference.
High-Level Architecture
Component Responsibilities
| Component | File | Purpose |
|---|---|---|
| AuroraAgentFactory | aurorasoc/agents/factory.py | Creates all 17 BeeAI agents (orchestrator + 16 specialists) with correct LLMs, tools, and prompts |
| Granite Module | aurorasoc/granite/__init__.py | Model resolution, configuration, and ChatModel creation |
| Model Registry | aurorasoc/granite/registry.py | Health checks, model availability, warmup |
| Agent Prompts | aurorasoc/agents/prompts.py | System prompts for each agent persona |
| BeeAI Framework | beeai-framework | RequirementAgent, ChatModel, tools, middleware |
The Request Flow
When an alert arrives in AuroraSOC, here's the complete flow from event to LLM response:
Step-by-step:
- Event ingestion: An alert arrives via MQTT or NATS JetStream
- Normalization: The event pipeline normalizes the alert into a standard format
- Dispatch: The API layer calls the factory to handle the alert
- Agent selection: The factory determines which specialist agent should handle this alert type
- Model resolution: _llm_for(agent_name) calls the Granite module to resolve the correct model
- ChatModel creation: The Granite module returns a BeeAI ChatModel instance configured for the correct backend
- Agent creation: The factory creates a RequirementAgent with the model, tools, and system prompt
- Inference: The agent uses the ChatModel to query Ollama or vLLM
- Response: The agent returns structured analysis back to the API layer
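In code, the dispatch-to-response half of this flow looks roughly like the sketch below. The handler name, alert fields, and result handling are illustrative assumptions; only AuroraAgentFactory and create_agent() come from the factory described in the next section.

```python
# Illustrative sketch of steps 3-9; the handler name, alert fields, and
# result handling are assumptions, not AuroraSOC's actual API layer.
from aurorasoc.agents.factory import AuroraAgentFactory

factory = AuroraAgentFactory()  # falls back to env-driven Granite defaults

async def handle_alert(alert: dict) -> str:
    # Agent selection: map the normalized alert to a specialist (field name assumed).
    agent_name = alert.get("category", "threat_hunter")

    # Model resolution, ChatModel creation, and agent creation all happen in the factory.
    agent = await factory.create_agent(agent_name)

    # Inference: the RequirementAgent queries Ollama or vLLM through its ChatModel.
    result = await agent.run(alert["summary"])

    # Response: structured analysis flows back to the API layer.
    return str(result)
```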
The Factory: AuroraAgentFactory
The factory (aurorasoc/agents/factory.py) is the central point where agents get their LLMs. Specialist definitions are declared in AGENT_SPECS, then instantiated through a generic create_agent() path.
Factory Initialization
```python
class AuroraAgentFactory:
    def __init__(self, model_name: str | None = None,
                 granite_config: GraniteModelConfig | None = None):
        self.granite_config = granite_config or get_default_granite_config()
        if model_name:
            self.granite_config.model_name = model_name
```
The factory accepts an optional model_name (for manual override) and an optional GraniteModelConfig. If neither is provided, it falls back to get_default_granite_config() which reads from environment variables.
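For orientation, get_default_granite_config() behaves roughly like the sketch below. The field names and environment variable names here are assumptions; the real ones are documented in the Granite Module Deep Dive.

```python
# Hypothetical sketch: field names and env var names are assumptions,
# not the actual Granite module implementation.
import os
from dataclasses import dataclass

@dataclass
class GraniteModelConfig:
    model_name: str
    backend: str               # e.g. "ollama" or "vllm"
    base_url: str | None = None

def get_default_granite_config() -> GraniteModelConfig:
    # Used when the factory receives no explicit config or model name.
    return GraniteModelConfig(
        model_name=os.getenv("GRANITE_MODEL_NAME", "granite4:small-h"),  # placeholder default
        backend=os.getenv("GRANITE_BACKEND", "ollama"),
        base_url=os.getenv("GRANITE_BASE_URL"),
    )
```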
The _llm_for() Method
This is the critical bridge between agents and models:
```python
def _llm_for(self, agent_name: str) -> ChatModel:
    """Resolve the appropriate ChatModel for a given agent."""
    return create_granite_chat_model(
        config=self.granite_config,
        agent_name=agent_name,
    )
```
Every create_*() method calls _llm_for() to get its model:
```python
def create_threat_hunter(self) -> RequirementAgent:
    llm = self._llm_for("threat_hunter")
    return RequirementAgent(
        llm=llm,
        tools=[ThinkTool(), ...],
        # ... system prompt, middleware
    )
```
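Inside the Granite module, create_granite_chat_model() ultimately hands off to BeeAI's ChatModel.from_name(). A minimal sketch, assuming a per-agent override table on the config object (the actual resolution logic is covered in the Granite Module Deep Dive, and the import path may vary by framework version):

```python
# Hypothetical sketch of create_granite_chat_model; the real resolution logic
# lives in the Granite module.
from beeai_framework.backend import ChatModel  # import path may vary by framework version

def create_granite_chat_model(config, agent_name: str | None = None) -> ChatModel:
    # Per-agent override first, shared default second (the override attribute
    # is an assumption for illustration).
    overrides = getattr(config, "agent_overrides", None) or {}
    model_name = overrides.get(agent_name, config.model_name)

    # BeeAI's ChatModel.from_name() resolves the provider from the
    # "<provider>:<model>" prefix, e.g. "ollama:granite4:small-h".
    return ChatModel.from_name(f"{config.backend}:{model_name}")
```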
The Orchestrator
The orchestrator is a special agent that routes alerts to the correct specialist via HandoffTool:
```python
def create_orchestrator(self, agents: dict) -> RequirementAgent:
    llm = self._llm_for("orchestrator")
    handoff_tools = [
        HandoffTool(agent=agent, name=name)
        for name, agent in agents.items()
    ]
    return RequirementAgent(
        llm=llm,
        tools=[ThinkTool(), *handoff_tools],
        # ...
    )
```
The orchestrator gets its own LLM (potentially a fine-tuned orchestration model) and can delegate to any specialist agent.
Building the Full Team
```python
async def build_full_team(self):
    # create_orchestrator expects a name -> agent mapping for its HandoffTools
    specialists = {
        name: await self.create_agent(name) for name in AGENT_SPECS
    }
    orchestrator = self.create_orchestrator(specialists)
    return orchestrator, specialists
```
This builds all 16 specialist agents from AGENT_SPECS and wires them into the orchestrator.
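Putting it together, a caller might build the team and route one alert like this; the alert text is a placeholder and result handling is simplified:

```python
# Illustrative usage; the alert text is a placeholder and result handling is simplified.
import asyncio
from aurorasoc.agents.factory import AuroraAgentFactory

async def main() -> None:
    factory = AuroraAgentFactory()
    orchestrator, specialists = await factory.build_full_team()

    # The orchestrator delegates to the right specialist via its HandoffTools.
    result = await orchestrator.run(
        "Suspicious PowerShell spawned by winword.exe on host WS-042. Triage this alert."
    )
    print(result)

asyncio.run(main())
```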
Data Flow Diagram
Key Design Decisions
Why BeeAI Framework?
AuroraSOC uses IBM's BeeAI Agent Framework because it provides:
- RequirementAgent — agents whose tool use is governed by declarative requirements, keeping specialist behavior predictable (see the sketch after this list)
- ConditionalRequirement — rules that control when a tool may or must run (e.g., force ThinkTool to run before other tools during complex analysis)
- GlobalTrajectoryMiddleware — observability middleware that surfaces each agent's reasoning steps and tool calls
- HandoffTool — zero-code agent-to-agent delegation
- ChatModel.from_name() — provider-agnostic model resolution (supports Ollama, vLLM, OpenAI-compatible APIs)
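A minimal RequirementAgent sketch makes the first two points concrete. The import paths, keyword arguments, and model tag follow recent beeai-framework examples and are assumptions that may differ between framework versions:

```python
# Sketch only: import paths and keyword arguments follow recent
# beeai-framework examples and may differ between versions.
from beeai_framework.agents.requirement import RequirementAgent
from beeai_framework.agents.requirement.requirements.conditional import ConditionalRequirement
from beeai_framework.backend import ChatModel
from beeai_framework.tools.think import ThinkTool

agent = RequirementAgent(
    llm=ChatModel.from_name("ollama:granite4:small-h"),  # placeholder model tag
    tools=[ThinkTool()],
    # Declarative rule: reason with ThinkTool before calling anything else.
    requirements=[ConditionalRequirement(ThinkTool, force_at_step=1)],
)
```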
Why Per-Agent Models?
Not all security domains need the same model:
- Malware analysis needs deep code/binary understanding → specialist model
- Compliance needs regulatory knowledge → specialist model
- Report generation needs structured output → different prompt engineering
Per-agent models let each specialist agent use a model fine-tuned for its exact domain.
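As a hypothetical illustration, an override table might list only the domains that benefit from fine-tuning, with every other specialist falling back to the shared base model (the agent names and model tags below are illustrative, not shipped defaults):

```python
# Hypothetical override table; agent names and fine-tuned model tags are illustrative.
AGENT_MODEL_OVERRIDES = {
    "malware_analyst": "granite4-malware-ft",        # deep code/binary understanding
    "compliance_officer": "granite4-compliance-ft",  # regulatory knowledge
    "report_writer": "granite4-reports-ft",          # structured report output
}

def model_for(agent_name: str, default: str) -> str:
    # Specialists without an override share the base Granite 4 model.
    return AGENT_MODEL_OVERRIDES.get(agent_name, default)
```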
Why Granite 4 Hybrid?
IBM Granite 4 Hybrid uses a Mamba-Transformer architecture:
- Mamba layers provide O(n) complexity for long sequences (better for log analysis)
- Transformer layers provide strong attention for reasoning tasks
- Hybrid combines both strengths at lower VRAM than a pure Transformer
Next Steps
- Granite Module Deep Dive — full details on model resolution
- Model Swap Guide — switch between base and fine-tuned models
- Serving Backends — Ollama vs vLLM configuration