# Per-Agent Specialist Training
AuroraSOC supports training dedicated models for each security domain. Instead of one generic SOC model, you can train individual specialists that outperform the generic model in their specific domain.
## Why Per-Agent Models?
A generic SOC model spreads its LoRA capacity across all 12+ security domains. A per-agent specialist concentrates all that capacity on a single domain — producing deeper, more accurate analysis.
### Performance comparison
| Model Type | Threat Hunting | Malware Analysis | Incident Response |
|---|---|---|---|
| Generic | Good | Good | Good |
| Specialist | ★★★★★ (if threat_hunter) | ★★★★★ (if malware_analyst) | ★★★★★ (if incident_responder) |
### When to use per-agent models
- Production deployments where accuracy matters for specific alert types
- High-volume SOCs where agents process thousands of alerts in their domain
- Compliance requirements that mandate domain-specific AI tuning
- After you have validated the training pipeline with the generic model
## Available Agent Profiles
The training configuration (`training/configs/granite_soc_finetune.yaml`) defines these agent profiles:
| Profile Name | Domain | Dataset Filter | Focus |
|---|---|---|---|
| `security_analyst` | General SOC | `alert_triage`, `incident_analysis` | Alert analysis, IOC extraction, MITRE mapping |
| `threat_hunter` | Threat Hunting | `threat_hunting` | Hunting hypotheses, LOLBin detection, KQL queries |
| `malware_analyst` | Malware | `malware_analysis` | YARA rules, sandbox analysis, PE headers |
| `incident_responder` | IR | `incident_response` | NIST playbooks, containment, eradication |
| `network_security` | Network | `network_security` | Flow analysis, Suricata rules, DNS tunneling |
| `cps_security` | ICS/OT | `cps_security` | Modbus, DNP3, IEC 62443, Purdue model |
| `threat_intel` | CTI | `threat_intel` | APT tracking, STIX/TAXII, diamond model |
| `forensic_analyst` | Forensics | `forensics` | Memory/disk forensics, chain of custody |
| `orchestrator` | Routing | `orchestration` | Multi-agent coordination, delegation |
## How Domain Filtering Works
When you train with `--agent <name>`, the pipeline filters the training dataset so that it only includes examples relevant to that domain.
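A rough sketch of that filter, assuming chat-style JSONL records and a comma-separated `dataset_filter` as shown in the profiles table (the function name here is illustrative, not the actual pipeline code):

```python
import json

def filter_by_domain(jsonl_lines, dataset_filter):
    """Keep only records whose `domain` is in the agent's comma-separated filter."""
    allowed = {d.strip() for d in dataset_filter.split(",")}
    return [rec for rec in map(json.loads, jsonl_lines) if rec.get("domain") in allowed]
```

For `threat_hunter` this keeps only `threat_hunting` records and drops everything else.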
The filtering keys off the `domain` field in each JSONL training record. The `TACTIC_TO_DOMAIN` mapping in `prepare_datasets.py` assigns MITRE ATT&CK tactics to specific domains:
| MITRE Tactic | Mapped Domain |
|---|---|
| Reconnaissance, Resource Development | threat_intel |
| Initial Access, Execution, Persistence | malware_analysis |
| Privilege Escalation, Defense Evasion | threat_hunting |
| Credential Access, Lateral Movement | network_security |
| Collection, Exfiltration, C2 | incident_response |
| Impact | incident_response |
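In Python terms, the table above is a plain dictionary lookup. This is a sketch reconstructed from the table, not the actual `TACTIC_TO_DOMAIN` in `prepare_datasets.py`:

```python
# Reconstructed from the tactic-to-domain table; the authoritative
# mapping is TACTIC_TO_DOMAIN in prepare_datasets.py.
TACTIC_TO_DOMAIN = {
    "Reconnaissance": "threat_intel",
    "Resource Development": "threat_intel",
    "Initial Access": "malware_analysis",
    "Execution": "malware_analysis",
    "Persistence": "malware_analysis",
    "Privilege Escalation": "threat_hunting",
    "Defense Evasion": "threat_hunting",
    "Credential Access": "network_security",
    "Lateral Movement": "network_security",
    "Collection": "incident_response",
    "Exfiltration": "incident_response",
    "Command and Control": "incident_response",
    "Impact": "incident_response",
}
```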
Each agent profile also injects its own system prompt from `AGENT_SYSTEM_PROMPTS`, ensuring the model learns to respond in the persona of that specific analyst.
## Training a Single Specialist
### Using Make
```shell
# Train one agent specialist
make train-agent AGENT=threat_hunter

# Equivalent to:
python training/scripts/finetune_granite.py \
    --config training/configs/granite_soc_finetune.yaml \
    --agent threat_hunter
```
### Using Docker
```shell
docker compose -f docker-compose.training.yml run \
    -e AGENT_NAME=threat_hunter \
    training-agent
```
## What Happens During Per-Agent Training
- Dataset filtering: `load_soc_dataset()` reads `soc_train.jsonl` and keeps only records whose `domain` matches the agent profile's `dataset_filter`.
- System prompt injection: each training example gets the agent's specific system prompt prepended.
- LoRA training: standard Unsloth/SFT training on the filtered dataset.
- Output: the model is saved to `training/output/<agent_name>/` (e.g., `training/output/threat_hunter/`).
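The system prompt injection step can be sketched as follows, assuming chat-format records with a `messages` list (illustrative names; the real logic lives in the fine-tuning script):

```python
def inject_system_prompt(example: dict, system_prompt: str) -> dict:
    """Prepend the agent's persona so every example is trained in that voice."""
    messages = [{"role": "system", "content": system_prompt}]
    return {**example, "messages": messages + example["messages"]}
```

Note that the original record is left untouched; a new dict is returned with the persona message at position 0.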
## Agent-Specific Output Directories
```text
training/output/
├── generic/              # make train (no --agent flag)
│   ├── adapter_model.safetensors
│   └── unsloth.Q8_0.gguf
├── threat_hunter/        # make train-agent AGENT=threat_hunter
│   ├── adapter_model.safetensors
│   └── unsloth.Q8_0.gguf
├── malware_analyst/      # make train-agent AGENT=malware_analyst
│   ├── adapter_model.safetensors
│   └── unsloth.Q8_0.gguf
└── ...
```
## Training All Specialists
The `train_all_agents.py` script trains all agent profiles sequentially:
```shell
# Train all 9 agent profiles + the generic model
python training/scripts/train_all_agents.py

# Or via Makefile
make train-all-agents
```
This script:
- Reads agent profiles from `granite_soc_finetune.yaml`
- Trains each agent sequentially (not in parallel — GPU memory constraint)
- Exports each model to LoRA + GGUF format
- Logs results for each agent
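The sequential loop boils down to shelling out to the single-agent entry point once per profile. A simplified sketch of the command construction; `build_cmd` and `train_all` are illustrative helpers, not the actual contents of `train_all_agents.py`:

```python
import subprocess

def build_cmd(agent, config="training/configs/granite_soc_finetune.yaml"):
    """Command line for one specialist run of finetune_granite.py."""
    return ["python", "training/scripts/finetune_granite.py",
            "--config", config, "--agent", agent]

def train_all(agents):
    """Train each agent back to back so only one job holds the GPU."""
    results = {}
    for agent in agents:
        results[agent] = subprocess.run(build_cmd(agent)).returncode == 0
    return results
```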
Time estimate (T4 GPU):
- ~15-30 min per agent × 9 agents = ~3-5 hours total
- On an A100: ~5-10 min per agent = ~1 hour total
## Importing All Specialists to Ollama
After training, import all specialist GGUFs to Ollama:
```shell
# Import all trained agents to Ollama
python training/scripts/serve_model.py ollama-all \
    --output-dir training/output

# This creates Ollama models:
#   granite-soc-threat-hunter:latest
#   granite-soc-malware-analyst:latest
#   granite-soc-incident-responder:latest
#   ... etc.
```
Each model gets the correct system prompt baked into its Ollama Modelfile:
```
FROM ./training/output/threat_hunter/unsloth.Q8_0.gguf

SYSTEM """You are the AuroraSOC Threat Hunter. You proactively search for
hidden threats using hypothesis-driven hunting..."""

TEMPLATE """{{- if .System }}<|start_of_role|>system<|end_of_role|>
{{ .System }}<|end_of_text|>
{{- end }}
<|start_of_role|>user<|end_of_role|>
{{ .Prompt }}<|end_of_text|>
<|start_of_role|>assistant<|end_of_role|>
{{ .Response }}<|end_of_text|>"""

PARAMETER temperature 0.1
PARAMETER stop "<|end_of_text|>"
PARAMETER stop "<|start_of_role|>"
```
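Once imported, a specialist can be queried like any other Ollama model. A minimal sketch against Ollama's HTTP API (default local endpoint assumed; the helper names are illustrative):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # default local Ollama endpoint

def build_payload(model: str, prompt: str) -> dict:
    """Non-streaming generation request for a specialist model."""
    return {"model": model, "prompt": prompt, "stream": False}

def ask_specialist(model: str, prompt: str) -> str:
    """POST the prompt to Ollama and return the model's text response."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(model, prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Requires a running Ollama server with the model imported:
# ask_specialist("granite-soc-threat-hunter:latest",
#                "Draft a hunting hypothesis for LOLBin abuse via certutil.")
```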
## The `AGENT_MODEL_MAP`

The `aurorasoc/granite/__init__.py` module maps each agent name to its Ollama model tag:
```python
AGENT_MODEL_MAP = {
    "security_analyst": "granite-soc-security-analyst",
    "threat_hunter": "granite-soc-threat-hunter",
    "malware_analyst": "granite-soc-malware-analyst",
    "incident_responder": "granite-soc-incident-responder",
    "network_security": "granite-soc-network-security",
    "cps_security": "granite-soc-cps-security",
    "threat_intel": "granite-soc-threat-intel",
    "ueba_analyst": "granite-soc-ueba-analyst",
    "forensic_analyst": "granite-soc-forensic-analyst",
    "endpoint_security": "granite-soc-endpoint-security",
    "web_security": "granite-soc-web-security",
    "cloud_security": "granite-soc-cloud-security",
    "compliance_analyst": "granite-soc-compliance-analyst",
    "vulnerability_manager": "granite-soc-vulnerability-manager",
    "report_generator": "granite-soc-report-generator",
    "orchestrator": "granite-soc-orchestrator",
}
```
When `GRANITE_USE_PER_AGENT_MODELS=true`, the factory resolves each agent to its dedicated model. Agents without a trained specialist fall back to the generic model.
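That fallback can be pictured as a small lookup helper (a sketch of the behaviour, not the actual factory code; the generic tag shown is an assumption):

```python
GENERIC_MODEL = "granite-soc"  # assumed tag for the generic fine-tune

def resolve_model(agent_name, agent_model_map, available_models):
    """Specialist tag if the agent has one trained and imported, else generic."""
    tag = agent_model_map.get(agent_name)
    if tag and tag in available_models:
        return tag
    return GENERIC_MODEL
```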
## Enabling Per-Agent Models
```shell
# Enable per-agent model resolution
export GRANITE_USE_FINETUNED=true
export GRANITE_USE_PER_AGENT_MODELS=true

# Or add to .env:
GRANITE_USE_FINETUNED=true
GRANITE_USE_PER_AGENT_MODELS=true

# Or use Makefile shortcut:
make enable-finetuned
```
The 4-tier model resolution order:
## Custom Agent Profiles
You can add your own agent profiles to `granite_soc_finetune.yaml`:
```yaml
agent_profiles:
  # ... existing profiles ...
  my_custom_agent:
    system_prompt: |
      You are a specialized incident triage bot for <Company X>.
      You classify alerts into P1-P4 based on asset criticality.
    dataset_filter: "incident_triage"
    model_override: "unsloth/granite-4.0-h-tiny"
    output_dir: "training/output/my_custom_agent"
```
Then train:
```shell
python training/scripts/finetune_granite.py \
    --config training/configs/granite_soc_finetune.yaml \
    --agent my_custom_agent
```
For the model to be resolved automatically in production, add it to `AGENT_MODEL_MAP` in `aurorasoc/granite/__init__.py`.
## Next Steps
- Evaluation & Export — benchmark specialist vs. generic models
- LLM Integration: Granite Module — deep dive into model resolution