
Training IBM Granite Models for AuroraSOC

AuroraSOC uses IBM Granite 4.0 language models as the reasoning backbone for all 16 AI agents. While the base Granite models provide strong general-purpose capabilities, fine-tuning them on Security Operations Center (SOC) data produces dramatically better results for security-specific tasks like alert triage, threat hunting, incident response, and malware analysis.

This guide covers every step of the training pipeline — from preparing datasets to deploying fine-tuned models into production.

Why Fine-Tune?

Base language models are trained on broad internet text. They know about cybersecurity, but they don't think like a SOC analyst. Fine-tuning addresses this gap:

| Capability | Base Granite 4 | Fine-Tuned Granite 4 |
| --- | --- | --- |
| Alert triage accuracy | General heuristics | SOC-specific severity scoring |
| MITRE ATT&CK mapping | Knows the framework | Automatically maps techniques to alerts |
| Incident response plans | Generic advice | NIST SP 800-61 / SANS-structured playbooks |
| ICS/OT protocol analysis | Minimal knowledge | Modbus, DNP3, OPC-UA, IEC 62443 |
| Output format | Free-form text | Structured JSON for downstream automation |
| Response time | Comparable | Faster (smaller, quantized models) |

Key insight: Fine-tuning teaches the model your SOC's language — the specific alert formats, triage procedures, escalation criteria, and reporting templates that your team uses every day.
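To make the "structured JSON" row concrete, here is a minimal sketch of parsing a triage response. The schema (field names like `alert_id`, `severity`, `mitre_techniques`) is a hypothetical illustration, not AuroraSOC's documented output format:

```python
import json

# Hypothetical triage output from a fine-tuned model (schema is an
# assumption for illustration, not AuroraSOC's actual format).
model_output = """{
  "alert_id": "SUR-2024-0001",
  "severity": "high",
  "mitre_techniques": ["T1059.001"],
  "recommended_action": "isolate_host",
  "confidence": 0.87
}"""

# Because the model emits valid JSON, downstream automation can parse it
# directly instead of scraping free-form text.
triage = json.loads(model_output)
print(triage["severity"], triage["mitre_techniques"][0])
```

The value of training for structured output is exactly this: downstream playbooks consume `json.loads(...)` results instead of regexing prose.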

What Gets Trained

AuroraSOC's training pipeline supports two modes:

1. Generic SOC Model

A single model trained on data from all SOC domains. Every agent uses the same fine-tuned weights. This is simpler to manage and works well when you have limited training data.

One model → All 16 agents

2. Per-Agent Specialist Models

Individual models fine-tuned for each agent's specific domain. The SecurityAnalyst agent gets a model trained primarily on alert analysis data, the ThreatHunter gets one focused on hunting hypotheses, etc. This produces better results but requires more compute and management overhead.

9 specialist models, one per training profile:
security_analyst → SecurityAnalyst, EndpointSecurity, WebSecurity, CloudSecurity
threat_hunter → ThreatHunter, UEBAAnalyst
malware_analyst → MalwareAnalyst
incident_responder → IncidentResponder
network_security → NetworkSecurity
cps_security → CPSSecurity
threat_intel → ThreatIntel
forensic_analyst → ForensicAnalyst
orchestrator → Orchestrator
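The profile-to-agent mapping above can be sketched as a routing table. This is an illustrative sketch only; the names are taken from the list above, but the function `profile_for_agent` is a hypothetical helper, not part of AuroraSOC's codebase:

```python
# Routing table implied by the mapping above: 9 training profiles,
# each serving one or more agent classes.
AGENT_MODEL_PROFILES = {
    "security_analyst": ["SecurityAnalyst", "EndpointSecurity", "WebSecurity", "CloudSecurity"],
    "threat_hunter": ["ThreatHunter", "UEBAAnalyst"],
    "malware_analyst": ["MalwareAnalyst"],
    "incident_responder": ["IncidentResponder"],
    "network_security": ["NetworkSecurity"],
    "cps_security": ["CPSSecurity"],
    "threat_intel": ["ThreatIntel"],
    "forensic_analyst": ["ForensicAnalyst"],
    "orchestrator": ["Orchestrator"],
}

def profile_for_agent(agent: str) -> str:
    """Return the training profile whose specialist model an agent loads."""
    for profile, agents in AGENT_MODEL_PROFILES.items():
        if agent in agents:
            return profile
    raise KeyError(agent)

print(profile_for_agent("UEBAAnalyst"))  # threat_hunter
```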

When to Train

Use this decision tree to determine your approach:

Decision Criteria

| Scenario | Recommendation | Why |
| --- | --- | --- |
| First deployment, testing | Base Granite 4 via Ollama | Zero training needed, get running fast |
| Production with standard SOC data | Generic fine-tuned model | Best effort-to-quality ratio |
| Production with large domain-specific datasets | Per-agent specialists | Maximum accuracy per domain |
| Air-gapped environment | Local GPU training | No internet required after initial setup |
| No GPU access | Google Colab (free T4) | Free 16GB GPU, exports GGUF for local use |
| Team-wide model iteration | Docker training pipeline | Reproducible, versioned, CI/CD friendly |

Architecture Overview

The training pipeline consists of five stages:

| Stage | Script / Tool | Output |
| --- | --- | --- |
| 1. Prepare Data | `training/scripts/prepare_datasets.py` | `training/data/soc_train.jsonl` |
| 2. Fine-Tune | `training/scripts/finetune_granite.py` or Colab notebook | LoRA adapters in `training/checkpoints/` |
| 3. Evaluate | `training/scripts/evaluate_model.py` | Benchmark scores (JSON report) |
| 4. Export | Built into finetune script | GGUF (Ollama) or FP16 (vLLM) |
| 5. Deploy | `training/scripts/serve_model.py` or `scripts/setup_local.sh` | Running model in Ollama or vLLM |
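The stage ordering above can be captured as data. The script paths come from the table; any CLI flags they accept are not documented here, so the commands are shown bare and never executed in this sketch (export is folded into the fine-tune stage, as the table notes):

```python
# Ordered pipeline stages (script paths from the table above; flags omitted
# because they are not specified in this document).
PIPELINE = [
    ("prepare", "python training/scripts/prepare_datasets.py"),
    ("finetune", "python training/scripts/finetune_granite.py"),  # also exports GGUF/FP16
    ("evaluate", "python training/scripts/evaluate_model.py"),
    ("serve", "python training/scripts/serve_model.py"),
]

def run_order():
    """Return stage names in execution order (commands are not run here)."""
    return [stage for stage, _ in PIPELINE]

print(" -> ".join(run_order()))  # prepare -> finetune -> evaluate -> serve
```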

IBM Granite 4.0 Model Variants

AuroraSOC supports four Granite 4.0 model variants. Choose based on your available hardware:

| Model | Architecture | Parameters | Min VRAM (4-bit) | Best For |
| --- | --- | --- | --- | --- |
| `unsloth/granite-4.0-micro` | Dense Transformer | ~1B | 4 GB | Quick testing, minimal hardware |
| `unsloth/granite-4.0-h-micro` | Hybrid Mamba-Transformer | ~1B | 6 GB | Fast inference, edge deployment |
| `unsloth/granite-4.0-h-tiny` | Hybrid Mamba-Transformer | ~2B | 8 GB | Recommended default (T4/RTX 3060+) |
| `unsloth/granite-4.0-h-small` | Hybrid Mamba-Transformer | ~4B | 16 GB | Best quality (A100/L4/RTX 4090) |

Why Hybrid Mamba-Transformer? The "h" variants use IBM's hybrid architecture that combines Mamba state-space layers with standard Transformer attention. This gives faster inference (especially for long sequences) while maintaining strong reasoning ability. The Mamba layers handle pattern recognition across long security logs, while the Transformer layers handle complex multi-step reasoning.

Training Framework: Unsloth

All training uses Unsloth, which provides:

  • 2× faster training compared to standard Hugging Face SFTTrainer
  • 70% less VRAM via intelligent gradient checkpointing
  • Native Granite 4 support including hybrid Mamba-Transformer LoRA targets
  • Built-in GGUF export for direct Ollama deployment
  • Response masking via train_on_responses_only — the model only learns from assistant outputs, not user prompts

What is LoRA?

LoRA (Low-Rank Adaptation) is a parameter-efficient fine-tuning technique. Instead of updating all model parameters (billions of weights), LoRA injects small trainable matrices into specific layers. This means:

  • Training uses much less VRAM (a 2B model fits on a 16GB T4)
  • Training is much faster (minutes to hours, not days)
  • The base model is not modified — you can swap LoRA adapters without re-downloading
  • You can have multiple LoRA adapters for different agents, all sharing the same base
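The VRAM savings follow from simple arithmetic. For a frozen weight matrix of shape (d_out, d_in), LoRA trains two small matrices A (r × d_in) and B (d_out × r), i.e. r · (d_in + d_out) parameters instead of d_out · d_in. The hidden size below is an assumed placeholder for illustration, not the real Granite 4 dimension:

```python
# Back-of-the-envelope LoRA parameter count for one square projection layer.
def lora_params(d_in: int, d_out: int, r: int) -> int:
    """Trainable params LoRA adds for one frozen (d_out x d_in) matrix."""
    return r * (d_in + d_out)

hidden = 2048   # assumed hidden size, for illustration only
rank = 64       # r from granite_soc_finetune.yaml

trainable = lora_params(hidden, hidden, rank)
full = hidden * hidden
print(trainable)           # 262144
print(trainable / full)    # 0.0625, i.e. 6.25% of the frozen matrix
```

At rank 64 on a 2048-wide square projection, LoRA trains about 6% as many parameters as full fine-tuning of that layer would, which is why a 2B model fits on a 16GB T4.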

AuroraSOC's LoRA configuration targets these layers (configured in training/configs/granite_soc_finetune.yaml):

```yaml
lora:
  r: 64                          # Rank — higher = more capacity, more VRAM
  lora_alpha: 64                 # Scaling factor (typically = r)
  target_modules:                # Which layers to adapt
    - q_proj                     # Query projection (attention)
    - k_proj                     # Key projection (attention)
    - v_proj                     # Value projection (attention)
    - o_proj                     # Output projection (attention)
    - gate_proj                  # Feed-forward gate
    - up_proj                    # Feed-forward up projection
    - down_proj                  # Feed-forward down projection
    - shared_mlp.input_linear    # Granite 4 Hybrid Mamba layers
    - shared_mlp.output_linear   # Granite 4 Hybrid Mamba layers
```

The shared_mlp.input_linear and shared_mlp.output_linear targets are specific to Granite 4 Hybrid models — they adapt the Mamba state-space layers in addition to the standard Transformer attention layers.

Chat Template

Granite 4 uses a specific chat template format. All training data must follow this format:

```
<|start_of_role|>system<|end_of_role|>
You are the AuroraSOC Security Analyst...<|end_of_text|>
<|start_of_role|>user<|end_of_role|>
Analyze this Suricata alert: ...<|end_of_text|>
<|start_of_role|>assistant<|end_of_role|>
Based on the alert, I identify the following...<|end_of_text|>
```

The training pipeline uses train_on_responses_only from Unsloth to mask the loss on system and user tokens. This means the model only learns to generate assistant responses — it doesn't memorize the prompts. This is critical for security: the model learns how to respond, not how to parrot training data.
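A minimal string-level sketch of this idea, using the role tokens shown above. Real training uses the tokenizer's chat template and Unsloth's `train_on_responses_only` on token IDs; the `format_granite` helper here is hypothetical and only illustrates which spans the loss would keep:

```python
# Render messages in the Granite 4 template and record the character spans
# an assistant-only loss would train on (token spans in real training).
def format_granite(messages):
    text, keep_spans = "", []
    for m in messages:
        chunk = (f"<|start_of_role|>{m['role']}<|end_of_role|>\n"
                 f"{m['content']}<|end_of_text|>\n")
        if m["role"] == "assistant":
            # Only assistant turns contribute to the loss.
            keep_spans.append((len(text), len(text) + len(chunk)))
        text += chunk
    return text, keep_spans

text, spans = format_granite([
    {"role": "system", "content": "You are the AuroraSOC Security Analyst..."},
    {"role": "user", "content": "Analyze this Suricata alert: ..."},
    {"role": "assistant", "content": "Based on the alert, I identify the following..."},
])
print(len(spans))  # 1: only the assistant turn is kept for the loss
```

Everything outside `spans` (system prompt, user prompt, role tokens) is masked, which is how the model learns to respond without memorizing prompts.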

Quick Reference: Make Targets

All training operations have Makefile shortcuts:

```bash
# Data preparation
make train-data                               # Download and prepare all SOC datasets

# Training (local GPU)
make train                                    # Train generic SOC model
make train AGENT=security_analyst             # Train specific agent model
make train-all                                # Train all 9 agent profiles

# Training (Docker)
make train-docker                             # Train in Docker container
make train-agent-docker AGENT=threat_hunter   # Docker per-agent

# Evaluation
make train-eval                               # Evaluate local checkpoint
make train-eval-ollama                        # Evaluate Ollama model

# Export & Serve
make train-serve-ollama                       # Import GGUF into Ollama
make train-serve-vllm                         # Start vLLM server
make train-modelfile                          # Generate Ollama Modelfile
```

Next Steps

  1. Prerequisites — System requirements and setup
  2. Dataset Preparation — Building training data
  3. Local GPU Training — Train on your own hardware
  4. Docker Training — Reproducible containerized training
  5. Google Colab Training — Free cloud GPU training
  6. Per-Agent Specialists — Domain-specific models
  7. Evaluation & Export — Test and deploy models
  8. Configuration Reference — Full YAML config docs