Model Swap Guide

AuroraSOC is designed for plug-and-play model swapping — you can switch between base, fine-tuned, per-agent, or even non-Granite models by changing environment variables. No code changes required.

Quick Reference

# Use base Granite 4 (default — no fine-tuning)
make disable-finetuned

# Use a single fine-tuned model for all agents
make enable-finetuned

# Use per-agent fine-tuned specialists
export GRANITE_USE_FINETUNED=true
export GRANITE_USE_PER_AGENT_MODELS=true

# Force a specific model for all agents (testing/debugging)
export GRANITE_MODEL_OVERRIDE=llama3.2:3b

Scenario 1: Base Model (Default)

When: First setup, before any fine-tuning, or for development/testing.

# .env
GRANITE_MODEL_NAME=granite3.2:2b
GRANITE_USE_FINETUNED=false
GRANITE_USE_PER_AGENT_MODELS=false
GRANITE_SERVING_BACKEND=ollama

What happens: All 16 agents use granite3.2:2b from Ollama. The model has general language capability but no AuroraSOC-specific security training.
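Before starting the stack, it is worth confirming the base model is actually present in Ollama. The sketch below queries Ollama's `/api/tags` endpoint (a real Ollama API) to list installed models; the env var name matches the `.env` above, and the helper names are illustrative, not part of AuroraSOC.

```python
# Sketch: confirm the configured base model is installed in Ollama.
# Assumes Ollama's HTTP API (GET /api/tags) on its default port.
import json
import os
import urllib.request

def model_names(tags_payload):
    """Extract installed model tags from an /api/tags JSON payload."""
    return {m["name"] for m in tags_payload.get("models", [])}

def base_model_installed(host="http://localhost:11434"):
    """True if the model named in GRANITE_MODEL_NAME is available."""
    wanted = os.environ.get("GRANITE_MODEL_NAME", "granite3.2:2b")
    with urllib.request.urlopen(f"{host}/api/tags") as resp:
        return wanted in model_names(json.load(resp))
```

If `base_model_installed()` returns False, run `ollama pull granite3.2:2b` first.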

Scenario 2: Single Fine-Tuned Model

When: You've trained a generic SOC model (via make train) and want all agents to use it.

# .env
GRANITE_USE_FINETUNED=true
GRANITE_USE_PER_AGENT_MODELS=false
GRANITE_FINETUNED_MODEL_TAG=granite-soc:latest

Or use the Makefile shortcut:

make enable-finetuned

What happens: All agents share the same fine-tuned model. This model has been trained on SOC data across all domains.

Scenario 3: Per-Agent Specialists

When: You've trained individual specialist models (via make train-all-agents) and want each agent to use its domain-specific model.

# .env
GRANITE_USE_FINETUNED=true
GRANITE_USE_PER_AGENT_MODELS=true
GRANITE_FINETUNED_MODEL_TAG=granite-soc:latest

What happens: Each agent resolves to its specialist model. Agents without a trained specialist fall back to the generic fine-tuned model, then to the base model.
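The fallback chain above (specialist → generic fine-tuned → base) can be sketched as follows. This is a minimal illustration of the described behavior, not AuroraSOC's actual `resolve_model` implementation; the `granite-soc-<agent>` tag convention matches the expected output shown later in this guide.

```python
# Hypothetical sketch of the fallback order: specialist, then the
# generic fine-tuned model, then the base model.
def resolve_model(agent, available, *, use_finetuned, per_agent,
                  base="granite3.2:2b", generic="granite-soc:latest"):
    """Pick the first candidate model that is actually installed."""
    candidates = []
    if use_finetuned and per_agent:
        candidates.append(f"granite-soc-{agent.replace('_', '-')}:latest")
    if use_finetuned:
        candidates.append(generic)
    candidates.append(base)
    return next((m for m in candidates if m in available), base)

installed = {"granite-soc-threat-hunter:latest", "granite-soc:latest",
             "granite3.2:2b"}
# Specialist exists -> use it; no specialist -> generic fine-tuned.
print(resolve_model("threat_hunter", installed,
                    use_finetuned=True, per_agent=True))
# -> granite-soc-threat-hunter:latest
print(resolve_model("malware_analyst", installed,
                    use_finetuned=True, per_agent=True))
# -> granite-soc:latest
```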

Scenario 4: Override for Testing

When: You want to test a completely different model (e.g., Llama, Mistral) across all agents.

# .env
GRANITE_MODEL_OVERRIDE=llama3.2:3b

What happens: The override bypasses ALL resolution logic. Every agent uses the specified model regardless of fine-tuning settings.

Important: The override takes precedence over everything. Even if GRANITE_USE_PER_AGENT_MODELS=true, the override wins.

Scenario 5: vLLM for Production

When: Deploying to production where you need high throughput and multiple concurrent requests.

# .env
GRANITE_SERVING_BACKEND=vllm
VLLM_API_BASE=http://vllm-server:8000
GRANITE_USE_FINETUNED=true

What happens: Model resolution works the same way, but ChatModel.from_name() uses the OpenAI-compatible API (pointing at vLLM) instead of Ollama.
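The backend switch boils down to which endpoint the client talks to. The sketch below shows one plausible way to derive the API base URL from the env vars above; the `api_base` helper is an assumption for illustration (the real logic lives inside `ChatModel.from_name()`), and it assumes vLLM's standard OpenAI-compatible server under `/v1`.

```python
# Hypothetical sketch: map backend settings to a chat API endpoint.
def api_base(env):
    """Return the endpoint implied by the serving-backend settings."""
    if env.get("GRANITE_SERVING_BACKEND") == "vllm":
        # vLLM exposes an OpenAI-compatible server under /v1.
        return env.get("VLLM_API_BASE", "http://localhost:8000") + "/v1"
    # Default: talk to Ollama directly.
    return env.get("OLLAMA_HOST", "http://localhost:11434")

print(api_base({"GRANITE_SERVING_BACKEND": "vllm",
                "VLLM_API_BASE": "http://vllm-server:8000"}))
# -> http://vllm-server:8000/v1
```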

Switching Between Scenarios

From Base → Fine-Tuned

# 1. Train the model
make train-data
make train

# 2. Import to Ollama
make train-serve-ollama

# 3. Enable fine-tuned
make enable-finetuned

# 4. Restart services
docker compose restart

From Fine-Tuned → Per-Agent

# 1. Train all specialists
make train-all-agents

# 2. Import all to Ollama
python training/scripts/serve_model.py ollama-all --output-dir training/output

# 3. Enable per-agent
echo "GRANITE_USE_PER_AGENT_MODELS=true" >> .env

# 4. Restart
docker compose restart

From Per-Agent → Back to Base

make disable-finetuned
docker compose restart

A/B Testing Models

You can run two instances of AuroraSOC with different model configurations:

# Instance A: Base model (port 8001)
GRANITE_USE_FINETUNED=false PORT=8001 docker compose up

# Instance B: Fine-tuned (port 8002)
GRANITE_USE_FINETUNED=true PORT=8002 docker compose up

Send the same alerts to both instances and compare:

  • Response quality
  • Latency
  • MITRE mapping accuracy
  • False positive rates
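A small harness can automate the comparison. In this sketch the `/analyze` endpoint and request shape are assumptions (check your AuroraSOC API reference); the ports match the docker compose commands above.

```python
# Sketch: send the same alert to both instances and time the responses.
import json
import time
import urllib.request

def analyze(port, alert):
    """POST an alert to one instance; return (seconds_elapsed, response).
    The /analyze path and {"alert": ...} payload are assumed, not
    AuroraSOC's documented API."""
    req = urllib.request.Request(
        f"http://localhost:{port}/analyze",
        data=json.dumps({"alert": alert}).encode(),
        headers={"Content-Type": "application/json"},
    )
    start = time.perf_counter()
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return time.perf_counter() - start, body

def faster(latencies):
    """latencies: {label: seconds}; return the label with lowest latency."""
    return min(latencies, key=latencies.get)

# Usage (with both instances running):
#   lat = {label: analyze(port, "ET TROJAN Cobalt Strike Beacon")[0]
#          for label, port in [("base", 8001), ("fine-tuned", 8002)]}
#   print(faster(lat))
```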

Using Non-Granite Models

AuroraSOC's architecture supports any model accessible via Ollama or an OpenAI-compatible API:

Via Ollama

# Pull any Ollama model
ollama pull llama3.2:3b
ollama pull mistral:7b
ollama pull qwen2.5:7b

# Use it in AuroraSOC
export GRANITE_MODEL_OVERRIDE=llama3.2:3b

Via OpenAI-Compatible API

# Point to any OpenAI-compatible endpoint
export GRANITE_SERVING_BACKEND=vllm
export VLLM_API_BASE=https://api.example.com
export GRANITE_MODEL_OVERRIDE=my-custom-model

Via Cloud Providers

# OpenAI
export GRANITE_SERVING_BACKEND=vllm
export VLLM_API_BASE=https://api.openai.com
export GRANITE_MODEL_OVERRIDE=gpt-4o-mini

# IBM watsonx
export GRANITE_SERVING_BACKEND=vllm
export VLLM_API_BASE=https://us-south.ml.cloud.ibm.com/ml/v1
export GRANITE_MODEL_OVERRIDE=ibm/granite-3-8b-instruct
Caution: Cloud API providers require appropriate authentication (API keys, tokens). Set the required auth headers in your environment. The fine-tuned per-agent specialist models are local only: they can't be used with cloud providers unless you upload them.

Verifying the Active Model

Check What Each Agent Resolves To

You can verify model resolution programmatically:

from aurorasoc.granite import get_default_granite_config

config = get_default_granite_config()

# Check resolution for each agent
agents = ["security_analyst", "threat_hunter", "malware_analyst",
          "incident_responder", "orchestrator"]

for agent in agents:
    model = config.resolve_model(agent)
    print(f"{agent:25s} → {model}")

Expected output with per-agent models enabled:

security_analyst          → granite-soc-security-analyst:latest
threat_hunter             → granite-soc-threat-hunter:latest
malware_analyst           → granite-soc-malware-analyst:latest
incident_responder        → granite-soc-incident-responder:latest
orchestrator              → granite-soc-orchestrator:latest

Check Ollama Models

# List all available models
ollama list

# Verify a specific model works
ollama run granite-soc:latest "Classify this alert: ET TROJAN Cobalt Strike Beacon"
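When several specialists are in play, checking them one by one gets tedious. The sketch below parses `ollama list` output to flag any missing specialist; the `granite-soc-<agent>` tag convention matches the expected output shown earlier (the authoritative mapping lives in AGENT_MODEL_MAP).

```python
# Sketch: confirm every expected specialist model was imported.
import subprocess

def parse_ollama_list(output):
    """Return the model tags (first column) from `ollama list` output,
    skipping the header row."""
    rows = output.strip().splitlines()[1:]
    return {row.split()[0] for row in rows if row.strip()}

def missing_specialists(agents):
    """List agents whose granite-soc-<agent>:latest tag is not installed."""
    out = subprocess.run(["ollama", "list"], capture_output=True,
                         text=True, check=True).stdout
    installed = parse_ollama_list(out)
    return [a for a in agents
            if f"granite-soc-{a.replace('_', '-')}:latest" not in installed]
```

An empty list from `missing_specialists([...])` means every agent will resolve to its own model.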

Troubleshooting

| Problem | Cause | Solution |
| --- | --- | --- |
| Agent uses base model despite `USE_FINETUNED=true` | Fine-tuned model not in Ollama | Run `make train-serve-ollama` to import |
| All agents use same model despite `PER_AGENT=true` | Agent model not in AGENT_MODEL_MAP or not imported | Check `ollama list`, import missing models |
| `ChatModel.from_name()` fails | Ollama not running or wrong host | Verify `OLLAMA_HOST`, run `ollama serve` |
| Model responds poorly | Using GGUF with excessive quantization | Re-export with `q8_0` instead of `q4_k_m` |
| Override not taking effect | Env var not propagated | Restart the service, check `docker compose config` |

Next Steps