AI Agent Fleet Deployment
Use this page when you want AuroraSOC running with the orchestrator and the specialist agents, rather than only the seeded demo UI from the Quick Start.
When To Use This Page
- You want real LLM-backed agent analysis instead of demo-only workflows.
- You want a repeatable local setup with validation after each major step.
- You want to choose explicitly between the Ollama and vLLM inference paths.
Deployment Paths
AuroraSOC currently exposes two practical local startup paths:
| Path | Best when | Runtime shape | Notes |
|---|---|---|---|
| Ollama local-first | You want the simplest local model serving path or do not want to depend on a GPU container stack | Shared data services in Compose, MCP servers + API + dashboard + agents on the host | Best for debugging and iterative local work |
| vLLM GPU stack | You want the closest thing to the full containerized deployment | API, dashboard, agents, and vLLM in Compose | Requires NVIDIA GPU support |
Both paths share the same AuroraSOC repository, .env file, and dashboard login flow.
Prerequisites
Before you begin, make sure you have:
- Python 3.12+
- Node.js 22+
- Git
- Either Podman 5+ or Docker 24+
- At least 16 GB RAM for the Ollama path
- At least 32 GB RAM plus an NVIDIA GPU for the vLLM path
Additional backend-specific requirements:
- Ollama path: Ollama installed locally
- vLLM path: NVIDIA drivers, container GPU runtime support, and a Hugging Face token for the first model pull
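One quick way to confirm the host toolchain before you continue; use whichever container runtime you actually installed, and the GPU check only matters for the vLLM path:
python3 --version    # expect 3.12 or newer
node --version       # expect 22 or newer
git --version
podman --version     # or: docker --version
nvidia-smi           # vLLM path only: should list your NVIDIA GPU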
Step 1: Clone The Repository
git clone https://github.com/ahmeddwalid/AuroraSOC
cd AuroraSOC
Expected result: you are in the repository root and can see docker-compose.yml, Makefile, and the docs/ directory.
Step 2: Create And Check The Environment File
make env-init
make env-check
Start with these values in .env before you choose a backend path:
SYSTEM_MODE=dummy
ENABLED_AGENTS=all
CORS_ORIGINS=http://localhost:3000,http://localhost:3001
For the Ollama path, also add:
LLM_BACKEND=ollama
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=granite4:8b
OLLAMA_ORCHESTRATOR_MODEL=granite4:dense
MCP_CLIENT_HOST=localhost
A2A_CLIENT_HOST=localhost
For the vLLM path, also add the following, with SYSTEM_MODE=dry_run replacing the dummy value set above:
LLM_BACKEND=vllm
HF_TOKEN=hf_your_token_here
VLLM_HF_MODEL=ibm-granite/granite-4.1-8b
VLLM_MODEL=granite-soc-specialist
VLLM_ORCHESTRATOR_MODEL=granite-soc-specialist
SYSTEM_MODE=dry_run
Expected result: .env exists, contains strong secrets, and points the selected backend at the correct host.
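As a sanity check, confirm the keys you just set are actually present; the pattern below covers the shared values plus the Ollama-specific ones, so adjust it if you chose the vLLM path:
grep -E '^(SYSTEM_MODE|ENABLED_AGENTS|CORS_ORIGINS|LLM_BACKEND|OLLAMA_BASE_URL|MCP_CLIENT_HOST|A2A_CLIENT_HOST)=' .env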
Step 3: Choose The Backend Path
Follow exactly one of the two paths below: the Ollama local-first steps (3.1 through 3.8) or the vLLM GPU stack steps (3.1 through 3.5).
Ollama Local-First
Step 3.1: Start Shared Data Services
Podman:
podman compose -f docker-compose.dev.yml up -d
podman compose -f docker-compose.dev.yml ps
Docker:
docker compose -f docker-compose.dev.yml up -d
docker compose -f docker-compose.dev.yml ps
Expected result: PostgreSQL, Redis, NATS, and Mosquitto are healthy on the local machine.
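If you want a port-level check in addition to the compose status output, the loop below assumes the stock ports for PostgreSQL, Redis, NATS, and Mosquitto (5432, 6379, 4222, 1883); adjust it if docker-compose.dev.yml maps them differently:
for port in 5432 6379 4222 1883; do
  ss -ltn | grep -q ":$port " && echo "port $port: listening" || echo "port $port: NOT listening"
done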
Step 3.2: Install Python And Dashboard Dependencies
python3 -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
alembic upgrade head
cd dashboard
npm install
cd ..
Expected result: the virtual environment is active, Python dependencies are installed, the database schema is current, and dashboard dependencies are present.
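Two lightweight checks, assuming the package installs under the aurorasoc import name used by the launch commands later in this guide:
python -c "import aurorasoc; print('aurorasoc import OK')"
alembic current                                   # should print the same revision as the migration head
test -d dashboard/node_modules && echo "dashboard dependencies present"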
Step 3.3: Start Ollama And Pull The Granite Models
In terminal 1:
ollama serve
In terminal 2:
make ollama-pull-granite
Expected result: ollama list shows granite4:8b and granite4:dense.
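Beyond ollama list, you can send a one-off generation request to confirm the specialist model actually loads; this uses Ollama's standard /api/generate endpoint with the model name from .env:
curl -s http://localhost:11434/api/generate \
  -d '{"model": "granite4:8b", "prompt": "Reply with the word ready.", "stream": false}' \
  | python -m json.tool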
Step 3.4: Start The MCP Domain Servers
In terminal 3:
mkdir -p .logs/mcp
while IFS=: read -r domain port; do
MCP_DOMAIN="$domain" MCP_PORT="$port" \
python -m aurorasoc.tools.mcp_launcher \
> ".logs/mcp/${domain}.log" 2>&1 &
done <<'EOF'
siem:8101
soar:8102
edr:8103
network:8104
malware:8105
threat_intel:8106
cps:8107
ueba:8108
forensics:8109
osint:8110
network_capture:8111
document:8112
malware_intel:8113
cloud_provider:8114
vuln_intel:8115
EOF
Expected result: .logs/mcp/ contains one log file per MCP domain server.
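To confirm that all fifteen domain servers came up, count the listening sockets in the 8101-8115 range and spot-check one log; the grep pattern below simply matches that port range:
ss -ltn | grep -cE ':81(0[1-9]|1[0-5]) '   # expect 15
tail -n 5 .logs/mcp/siem.log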
Step 3.5: Start The API And Dashboard
In terminal 4:
make dev-all
Expected result: the API is listening on port 8000 and the dashboard is listening on port 3000.
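A quick liveness check for both processes once make dev-all settles:
curl -s http://localhost:8000/health | python -m json.tool
curl -sI http://localhost:3000 | head -n 1   # expect an HTTP 200, or a redirect to the login page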
Step 3.6: Start The Specialist Agents And The Orchestrator
In terminal 5:
mkdir -p .logs/agents
while IFS=: read -r name port; do
AGENT_NAME="$name" AGENT_PORT="$port" \
python -m aurorasoc.agents.generic_server \
> ".logs/agents/${name}.log" 2>&1 &
done <<'EOF'
SecurityAnalyst:9001
ThreatHunter:9002
MalwareAnalyst:9003
IncidentResponder:9004
NetworkSecurity:9005
WebSecurity:9006
CloudSecurity:9007
CPSSecurity:9008
ThreatIntel:9009
EndpointBehavior:9010
ForensicAnalyst:9012
ReportGenerator:9015
NetworkAnalyzer:9016
EOF
python -m aurorasoc.agents.orchestrator.server
Expected result: the orchestrator starts in the foreground and the specialist logs appear under .logs/agents/.
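From another terminal you can confirm the specialist listeners before moving on; the range below covers the ports above plus 9000, which the validation commands in Step 3.7 treat as the orchestrator:
ss -ltn | grep -cE ':90(0[0-9]|1[0-6]) '   # expect one socket per specialist plus the orchestrator
ls .logs/agents/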
Step 3.7: Validate The Local-First Agent Mesh
In a new terminal:
make prod-validate
curl -s http://localhost:8000/health | python -m json.tool
curl -s http://localhost:11434/api/tags | python -m json.tool
ss -ltn | grep -E ':9000|:9001|:9010|:9016'
tail -n 20 .logs/agents/SecurityAnalyst.log
make prod-validate confirms the current .env, the PostgreSQL migration head, and the selected LLM backend before you rely on the live mesh. Then open http://localhost:3000, log in with admin / admin123!, and confirm that the dashboard renders.
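If the dashboard loads but analyses never complete, a quick scan of the specialist logs is often faster than tailing each file; this is plain grep, not an AuroraSOC-specific tool:
grep -rilE 'error|traceback' .logs/agents/ || echo "no errors or tracebacks logged"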
Step 3.8: Stop The Local-First Stack
pkill -f 'aurorasoc.agents.generic_server'
pkill -f 'aurorasoc.agents.orchestrator.server'
pkill -f 'aurorasoc.tools.mcp_launcher'
Use the same container runtime you used above to stop the shared services:
Podman:
podman compose -f docker-compose.dev.yml down
Docker:
docker compose -f docker-compose.dev.yml down
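To confirm nothing from the fleet is still running on the host after the pkill commands:
pgrep -fa aurorasoc || echo "no AuroraSOC host processes remaining"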
vLLM GPU Stack
Step 3.1: Confirm GPU Access
nvidia-smi
Expected result: your NVIDIA GPU is visible on the host before you start containers.
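nvidia-smi on the host is not quite enough on its own: the container runtime must also be able to reach the GPU. One way to test that, assuming the NVIDIA Container Toolkit is installed and using a public CUDA base image:
nvidia-ctk --version
docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi
# Podman equivalent via CDI:
podman run --rm --device nvidia.com/gpu=all docker.io/nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi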
Step 3.2: Start The Full Stack
Podman:
podman compose \
-f docker-compose.yml \
-f docker-compose.gpu.yml \
--profile agents up -d
Docker:
docker compose \
-f docker-compose.yml \
-f docker-compose.gpu.yml \
--profile agents up -d
Add --profile rust-core only if you want the optional Rust fast path.
Expected result: the API, dashboard, data services, agents, and vLLM containers are created.
Step 3.3: Run The Database Migrations
Podman:
podman compose exec api alembic upgrade head
Docker:
docker compose exec api alembic upgrade head
Expected result: the API container reports that the database is at head.
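To double-check from inside the container (substitute podman for docker if that is your runtime):
docker compose exec api alembic current   # should report the head revision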
Step 3.4: Validate The Full Container Stack
Podman:
podman compose ps
make prod-validate
curl -s http://localhost:8000/health | python -m json.tool
curl -s http://localhost:8001/v1/models | python -m json.tool
Docker:
docker compose ps
make prod-validate
curl -s http://localhost:8000/health | python -m json.tool
curl -s http://localhost:8001/v1/models | python -m json.tool
Then open http://localhost:3000 and log in with admin / admin123!.
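Because vLLM serves an OpenAI-compatible API, you can also send a one-shot chat request against the served model name from .env; the port and model name below come from this guide's configuration and may differ in your deployment:
curl -s http://localhost:8001/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"model": "granite-soc-specialist", "messages": [{"role": "user", "content": "Reply with the word ready."}], "max_tokens": 16}' \
  | python -m json.tool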
Step 3.5: Stop The Full Container Stack
Podman:
podman compose -f docker-compose.yml -f docker-compose.gpu.yml down
Docker:
docker compose -f docker-compose.yml -f docker-compose.gpu.yml down
Troubleshooting
make env-check fails
Cause: .env is missing required secrets or still contains placeholder values.
Fix: rerun make env-init, then review the file manually before starting services.
Ollama models are missing
Cause: the local model store does not contain granite4:8b or granite4:dense.
Fix: run make ollama-pull-granite again and confirm the models with ollama list.
Local agents cannot reach MCP servers
Cause: host-run agents still point at Compose service names instead of localhost.
Fix: make sure .env contains MCP_CLIENT_HOST=localhost before starting the local-first path.
The orchestrator cannot reach specialists
Cause: host-run A2A discovery is still using Compose service names.
Fix: make sure .env contains A2A_CLIENT_HOST=localhost before you start the orchestrator.
vLLM startup stalls on the first boot
Cause: the model image is still downloading or initializing.
Fix: follow the vLLM container logs and wait for the server to report that it is ready before testing the API.
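The exact service name depends on the Compose files, but assuming it is called vllm, you can follow startup like this (substitute podman for docker as needed):
docker compose -f docker-compose.yml -f docker-compose.gpu.yml logs -f vllm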
You want deeper operational detail
Use the Agent Fleet Runbook for profile selection, health checks, logs, and host-run versus containerized troubleshooting.