# Model Swap Guide

AuroraSOC is designed for plug-and-play model swapping: you can switch between base, fine-tuned, per-agent, or even non-Granite models by changing environment variables. No code changes are required.
## Quick Reference

```bash
# Use the base Granite model (default, no fine-tuning)
make disable-finetuned

# Use a single fine-tuned model for all agents
make enable-finetuned

# Use per-agent fine-tuned specialists
export GRANITE_USE_FINETUNED=true
export GRANITE_USE_PER_AGENT_MODELS=true

# Force a specific model for all agents (testing/debugging)
export GRANITE_MODEL_OVERRIDE=llama3.2:3b
```
## Scenario 1: Base Model (Default)

**When:** First setup, before any fine-tuning, or for development/testing.

```bash
# .env
GRANITE_MODEL_NAME=granite3.2:2b
GRANITE_USE_FINETUNED=false
GRANITE_USE_PER_AGENT_MODELS=false
GRANITE_SERVING_BACKEND=ollama
```

**What happens:** All 16 agents use `granite3.2:2b` from Ollama. The model has general language capability but no AuroraSOC-specific security training.
## Scenario 2: Single Fine-Tuned Model

**When:** You've trained a generic SOC model (via `make train`) and want all agents to use it.

```bash
# .env
GRANITE_USE_FINETUNED=true
GRANITE_USE_PER_AGENT_MODELS=false
GRANITE_FINETUNED_MODEL_TAG=granite-soc:latest
```

Or use the Makefile shortcut:

```bash
make enable-finetuned
```

**What happens:** All agents share the same fine-tuned model, trained on SOC data across all domains.
## Scenario 3: Per-Agent Specialists

**When:** You've trained individual specialist models (via `make train-all-agents`) and want each agent to use its domain-specific model.

```bash
# .env
GRANITE_USE_FINETUNED=true
GRANITE_USE_PER_AGENT_MODELS=true
GRANITE_FINETUNED_MODEL_TAG=granite-soc:latest
```

**What happens:** Each agent resolves to its specialist model. Agents without a trained specialist fall back to the generic fine-tuned model, then to the base model.
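The fallback chain can be sketched in a few lines of Python. This is an illustrative model of the behavior described above, not AuroraSOC's actual internals: the `AGENT_MODEL_MAP` contents and the specialist tag names are assumptions.

```python
# Hypothetical sketch of the specialist -> generic -> base fallback chain.
# AGENT_MODEL_MAP and the model tags below are illustrative assumptions.
AGENT_MODEL_MAP = {
    "security_analyst": "granite-soc-security-analyst:latest",
    "threat_hunter": "granite-soc-threat-hunter:latest",
}

def resolve_model(agent: str, available: set,
                  generic_tag: str = "granite-soc:latest",
                  base: str = "granite3.2:2b") -> str:
    """Return the first model in the fallback chain that is actually available."""
    specialist = AGENT_MODEL_MAP.get(agent)
    if specialist and specialist in available:
        return specialist      # 1. per-agent specialist
    if generic_tag in available:
        return generic_tag     # 2. generic fine-tuned model
    return base                # 3. base Granite model

# Only one specialist plus the generic model have been imported:
available = {"granite-soc-security-analyst:latest", "granite-soc:latest"}
print(resolve_model("security_analyst", available))  # specialist tag
print(resolve_model("malware_analyst", available))   # falls back to generic tag
```

The key design point is that fallback is driven by what is actually importable/serveable, so a partially trained set of specialists still leaves every agent with a working model.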
## Scenario 4: Override for Testing

**When:** You want to test a completely different model (e.g., Llama, Mistral) across all agents.

```bash
# .env
GRANITE_MODEL_OVERRIDE=llama3.2:3b
```

**What happens:** The override bypasses ALL resolution logic. Every agent uses the specified model regardless of fine-tuning settings.

**Important:** The override takes precedence over everything. Even if `GRANITE_USE_PER_AGENT_MODELS=true`, the override wins.
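The precedence order of the environment variables can be expressed as a small sketch. The variable names match the guide; the resolution function itself is an assumption about how the flags interact, and the per-agent tag format here is hypothetical.

```python
import os

def effective_model(agent: str) -> str:
    """Illustrative precedence: override > per-agent > generic fine-tuned > base.
    A sketch of the documented behavior, not AuroraSOC's real resolver."""
    override = os.environ.get("GRANITE_MODEL_OVERRIDE")
    if override:
        return override  # the override short-circuits everything else
    if os.environ.get("GRANITE_USE_FINETUNED") == "true":
        if os.environ.get("GRANITE_USE_PER_AGENT_MODELS") == "true":
            # Hypothetical tag format for the per-agent specialist
            return f"granite-soc-{agent.replace('_', '-')}:latest"
        return os.environ.get("GRANITE_FINETUNED_MODEL_TAG", "granite-soc:latest")
    return os.environ.get("GRANITE_MODEL_NAME", "granite3.2:2b")

os.environ["GRANITE_USE_FINETUNED"] = "true"
os.environ["GRANITE_USE_PER_AGENT_MODELS"] = "true"
os.environ["GRANITE_MODEL_OVERRIDE"] = "llama3.2:3b"
print(effective_model("threat_hunter"))  # llama3.2:3b — the override wins
```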
## Scenario 5: vLLM for Production

**When:** Deploying to production, where you need high throughput and many concurrent requests.

```bash
# .env
GRANITE_SERVING_BACKEND=vllm
VLLM_API_BASE=http://vllm-server:8000
GRANITE_USE_FINETUNED=true
```

**What happens:** Model resolution works the same way, but `ChatModel.from_name()` uses the OpenAI-compatible API (pointing at vLLM) instead of Ollama.
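To make "OpenAI-compatible" concrete, here is a sketch of the request such a backend receives. The `/v1/chat/completions` path and payload fields are the standard OpenAI chat-completions shape; whether AuroraSOC's client sets these exact parameters is an assumption.

```python
import json

def build_chat_request(api_base: str, model: str, prompt: str):
    """Build the URL and JSON body for a standard chat-completions call."""
    url = f"{api_base}/v1/chat/completions"
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.1,  # illustrative; a SOC pipeline likely wants low temperature
    }
    return url, payload

url, payload = build_chat_request("http://vllm-server:8000",
                                  "granite-soc:latest",
                                  "Classify this alert: ET TROJAN Cobalt Strike Beacon")
print(url)
print(json.dumps(payload, indent=2))
```

Because vLLM exposes this same interface, swapping the backend only changes the base URL, not the request shape.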
## Switching Between Scenarios

### From Base → Fine-Tuned

```bash
# 1. Train the model
make train-data
make train

# 2. Import to Ollama
make train-serve-ollama

# 3. Enable fine-tuned
make enable-finetuned

# 4. Restart services
docker compose restart
```

### From Fine-Tuned → Per-Agent

```bash
# 1. Train all specialists
make train-all-agents

# 2. Import all to Ollama
python training/scripts/serve_model.py ollama-all --output-dir training/output

# 3. Enable per-agent
echo "GRANITE_USE_PER_AGENT_MODELS=true" >> .env

# 4. Restart
docker compose restart
```

### From Per-Agent → Back to Base

```bash
make disable-finetuned
docker compose restart
```
## A/B Testing Models

You can run two instances of AuroraSOC with different model configurations:

```bash
# Instance A: base model (port 8001)
GRANITE_USE_FINETUNED=false PORT=8001 docker compose up

# Instance B: fine-tuned (port 8002)
GRANITE_USE_FINETUNED=true PORT=8002 docker compose up
```

Send the same alerts to both instances and compare:

- Response quality
- Latency
- MITRE mapping accuracy
- False positive rates
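One of these comparisons, MITRE mapping accuracy, is easy to score mechanically. The sketch below assumes you have collected per-alert technique predictions from each instance alongside ground-truth labels; the data shapes and technique IDs are illustrative.

```python
def mapping_accuracy(predictions, truth):
    """Fraction of alerts whose predicted MITRE technique matches ground truth."""
    correct = sum(p == t for p, t in zip(predictions, truth))
    return correct / len(truth) if truth else 0.0

# Illustrative results from replaying the same four alerts through both instances
truth      = ["T1059", "T1071", "T1486", "T1003"]
instance_a = ["T1059", "T1105", "T1486", "T1047"]  # base model
instance_b = ["T1059", "T1071", "T1486", "T1003"]  # fine-tuned model

print(f"Instance A accuracy: {mapping_accuracy(instance_a, truth):.0%}")  # 50%
print(f"Instance B accuracy: {mapping_accuracy(instance_b, truth):.0%}")  # 100%
```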
## Using Non-Granite Models

AuroraSOC's architecture supports any model accessible via Ollama or an OpenAI-compatible API.

### Via Ollama

```bash
# Pull any Ollama model
ollama pull llama3.2:3b
ollama pull mistral:7b
ollama pull qwen2.5:7b

# Use it in AuroraSOC
export GRANITE_MODEL_OVERRIDE=llama3.2:3b
```
### Via OpenAI-Compatible API

```bash
# Point to any OpenAI-compatible endpoint
export GRANITE_SERVING_BACKEND=vllm
export VLLM_API_BASE=https://api.example.com
export GRANITE_MODEL_OVERRIDE=my-custom-model
```
### Via Cloud Providers

```bash
# OpenAI
export GRANITE_SERVING_BACKEND=vllm
export VLLM_API_BASE=https://api.openai.com
export GRANITE_MODEL_OVERRIDE=gpt-4o-mini

# IBM watsonx
export GRANITE_SERVING_BACKEND=vllm
export VLLM_API_BASE=https://us-south.ml.cloud.ibm.com/ml/v1
export GRANITE_MODEL_OVERRIDE=ibm/granite-3-8b-instruct
```

Cloud API providers require appropriate authentication (API keys, tokens); set the required auth headers in your environment. The fine-tuned per-agent specialist models are local only: they can't be used with cloud providers unless you upload them.
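For OpenAI-compatible endpoints, authentication is usually a bearer token in the `Authorization` header. A minimal sketch, assuming the key lives in an `OPENAI_API_KEY` environment variable (the variable name is an assumption; use whatever your provider and client expect):

```python
import os

def auth_headers() -> dict:
    """Build HTTP headers for an OpenAI-compatible endpoint.
    OPENAI_API_KEY is an assumed variable name, not an AuroraSOC setting."""
    headers = {"Content-Type": "application/json"}
    key = os.environ.get("OPENAI_API_KEY")
    if key:
        headers["Authorization"] = f"Bearer {key}"
    return headers

os.environ["OPENAI_API_KEY"] = "sk-example"  # illustrative placeholder only
print(auth_headers()["Authorization"])       # Bearer sk-example
```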
## Verifying the Active Model

### Check What Each Agent Resolves To

You can verify model resolution programmatically:

```python
from aurorasoc.granite import get_default_granite_config

config = get_default_granite_config()

# Check resolution for each agent
agents = ["security_analyst", "threat_hunter", "malware_analyst",
          "incident_responder", "orchestrator"]
for agent in agents:
    model = config.resolve_model(agent)
    print(f"{agent:25s} → {model}")
```
Expected output with per-agent models enabled:

```text
security_analyst          → granite-soc-security-analyst:latest
threat_hunter             → granite-soc-threat-hunter:latest
malware_analyst           → granite-soc-malware-analyst:latest
incident_responder        → granite-soc-incident-responder:latest
orchestrator              → granite-soc-orchestrator:latest
```
### Check Ollama Models

```bash
# List all available models
ollama list

# Verify a specific model works
ollama run granite-soc:latest "Classify this alert: ET TROJAN Cobalt Strike Beacon"
```
## Troubleshooting

| Problem | Cause | Solution |
|---|---|---|
| Agent uses base model despite `USE_FINETUNED=true` | Fine-tuned model not in Ollama | Run `make train-serve-ollama` to import |
| All agents use same model despite `PER_AGENT=true` | Agent model not in `AGENT_MODEL_MAP` or not imported | Check `ollama list`, import missing models |
| `ChatModel.from_name()` fails | Ollama not running or wrong host | Verify `OLLAMA_HOST`, run `ollama serve` |
| Model responds poorly | Using GGUF with excessive quantization | Re-export with `q8_0` instead of `q4_k_m` |
| Override not taking effect | Env var not propagated | Restart the service, check `docker compose config` |
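Several of the rows above come down to "what is the environment actually set to?". A quick diagnostic sketch that dumps the model-related variables described in this guide (the variable names match the guide; unset values print as `(unset)`):

```python
import os

# Model-related environment variables described in this guide
VARS = [
    "GRANITE_MODEL_NAME",
    "GRANITE_USE_FINETUNED",
    "GRANITE_USE_PER_AGENT_MODELS",
    "GRANITE_FINETUNED_MODEL_TAG",
    "GRANITE_MODEL_OVERRIDE",
    "GRANITE_SERVING_BACKEND",
]

def dump_config() -> dict:
    """Snapshot the current values of the model-selection env vars."""
    return {v: os.environ.get(v, "(unset)") for v in VARS}

for name, value in dump_config().items():
    print(f"{name:32s} {value}")
```

Run this inside the container (not just on the host) so you see the values the agents actually receive after `docker compose` interpolation.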
## Next Steps
- Serving Backends — Ollama vs vLLM in detail
- Local Deployment — complete local setup walkthrough