# Environment Variables Reference

A complete reference of the runtime environment variables used by the current codebase.

## Core Application

| Variable | Default | Description |
|---|---|---|
| ENVIRONMENT | development | Runtime environment (development, staging, production) |
| DEBUG | false | Enable debug mode (verbose logging, auto-reload) |
| LOG_LEVEL | INFO | Logging level: DEBUG, INFO, WARNING, ERROR |
| SYSTEM_MODE | real | Runtime guardrail mode (dummy, dry_run, real) |
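
As a quick illustration, a local development `.env` built from the table above might look like this (values are examples, not prescriptions; `dry_run` is assumed here to exercise flows without irreversible actions):

```bash
# Example .env fragment for local development (illustrative values)
ENVIRONMENT=development
DEBUG=true
LOG_LEVEL=DEBUG
SYSTEM_MODE=dry_run   # assumed to run the pipeline without real side effects
```
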
## LLM Provider

| Variable | Default | Description |
|---|---|---|
| LLM_BACKEND | vllm | Inference backend (vllm or ollama) |
| VLLM_MODEL | granite-soc-specialist | vLLM specialist model |
| VLLM_ORCHESTRATOR_MODEL | granite-soc-specialist | vLLM orchestrator model |
| VLLM_BASE_URL | http://vllm:8000/v1 | vLLM instance URL |
| OLLAMA_MODEL | granite4:8b | Ollama specialist model |
| OLLAMA_ORCHESTRATOR_MODEL | granite4:8b | Ollama orchestrator model. Keep this equal to OLLAMA_MODEL for single-model local deployments. |
| OLLAMA_BASE_URL | http://ollama:11434 | Ollama instance URL |
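
For example, switching the stack from vLLM to a local Ollama daemon typically only requires changing the backend selector and, when running outside the compose network, the base URL (the `localhost` substitution below is an assumption about your setup):

```bash
# Point all agents at a local Ollama daemon instead of vLLM
LLM_BACKEND=ollama
OLLAMA_BASE_URL=http://localhost:11434   # use http://ollama:11434 inside compose
OLLAMA_MODEL=granite4:8b
OLLAMA_ORCHESTRATOR_MODEL=granite4:8b    # keep equal for single-model deployments
```
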
## Local Single-Model Controls

These variables keep all AuroraSOC agents on one local inference service and one model tag, which is the recommended MVP path for 8 GB VRAM laptops and other constrained single-machine deployments.

| Variable | Default | Description |
|---|---|---|
| GRANITE_SINGLE_MODEL_MODE | true | Forces Ollama specialist and orchestrator agents to resolve to the same model tag, even if per-agent model routing or a separate orchestrator tag is configured. |
| GRANITE_USE_SHARED_MODEL_POOL | true | Reuses process-local BeeAI ChatModel instances for identical backend/model/provider tuples. |
| GRANITE_MAX_CONCURRENT_REQUESTS | 1 | Recommended local inference concurrency. Keep at 1 on 8 GB VRAM to avoid model thrash. |
| GRANITE_REQUEST_QUEUE_SIZE | 16 | Planned bounded queue size for shared LLM request admission. |
| GRANITE_INFERENCE_TIMEOUT_SECONDS | 180 | Planned per-request timeout budget for local LLM inference. |

For a quota-limited local install, inspect the installed models first with `ollama list`. If a usable model is already present, point both OLLAMA_MODEL and OLLAMA_ORCHESTRATOR_MODEL at that tag instead of pulling another model, as sketched below.
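
A minimal sketch of that workflow (the granite4:8b tag is just the default from the table above; substitute whatever `ollama list` actually reports):

```bash
# 1. See which models are already installed locally
ollama list

# 2. Reuse an existing tag for both roles instead of pulling a second model
export OLLAMA_MODEL=granite4:8b
export OLLAMA_ORCHESTRATOR_MODEL=granite4:8b
```
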
## PostgreSQL

| Variable | Default | Description |
|---|---|---|
| PG_HOST | postgres | PostgreSQL hostname |
| PG_PORT | 5432 | PostgreSQL port |
| PG_USER | aurorasoc | Database user |
| PG_PASSWORD | (required in compose) | Database password |
| PG_DATABASE | aurorasoc | Database name |
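
A quick way to confirm the PG_* values are correct is a one-off psql probe; this sketch assumes the psql client is installed and the variables are already exported:

```bash
# Connectivity check using the same variables the application reads
PGPASSWORD="$PG_PASSWORD" psql \
  -h "$PG_HOST" -p "$PG_PORT" \
  -U "$PG_USER" -d "$PG_DATABASE" \
  -c 'SELECT version();'
```
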
## Redis

| Variable | Default | Description |
|---|---|---|
| REDIS_URL | redis://redis:6379 | Direct connection URL |
## NATS JetStream

| Variable | Default | Description |
|---|---|---|
| NATS_URL | nats://localhost:4222 | NATS server URL |
| NATS_STREAM_NAME | AURORA | JetStream stream name |
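
Note that NATS_URL defaults to localhost, unlike the other services, which default to compose hostnames. Inside the compose network you will typically override it with the broker's service name (shown as `nats` here, which is an assumption about your compose file):

```bash
# Override the localhost default when running inside docker compose
NATS_URL=nats://nats:4222
```
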
## MQTT

| Variable | Default | Description |
|---|---|---|
| MQTT_HOST | mosquitto | MQTT broker hostname |
| MQTT_PORT | 8883 | MQTT broker port (mTLS-first) |
| MQTT_USERNAME | — | MQTT username (optional) |
| MQTT_PASSWORD | — | MQTT password (optional) |
| MQTT_TOPIC_PREFIX | aurora | MQTT topic prefix |
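
For a quick local test against a plain (non-TLS) broker, the conventional MQTT port can be substituted; whether the stack accepts non-TLS connections is deployment-specific, so treat this as a sketch:

```bash
# Local testing against a broker without TLS (1883 is the conventional plain port)
MQTT_HOST=localhost
MQTT_PORT=1883
MQTT_TOPIC_PREFIX=aurora
```
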
## pgvector (PostgreSQL Extension)

Vector embedding storage is handled by pgvector inside the same PostgreSQL instance, so no separate service is needed. The relevant pool and SSL settings are listed below.

| Variable | Default | Description |
|---|---|---|
| PG_POOL_SIZE | 20 | SQLAlchemy connection pool size |
| PG_MAX_OVERFLOW | 10 | Extra connections beyond the pool size |
| PG_SSLMODE | prefer | SSL mode (disable, prefer, require, verify-ca, verify-full) |
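
As an example, a production-leaning profile might tighten the SSL mode and shrink the pool for a small database instance (values are illustrative, not tuned recommendations):

```bash
PG_POOL_SIZE=10
PG_MAX_OVERFLOW=5
PG_SSLMODE=verify-full   # requires a CA certificate the server cert chains to
```
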
## HashiCorp Vault

| Variable | Default | Description |
|---|---|---|
| VAULT_ADDR | http://vault:8200 | Vault server URL |
| VAULT_TOKEN | — | Vault access token |
| VAULT_KV_MOUNT | secret | KV-v2 secrets mount path |
| VAULT_PKI_MOUNT | pki_iot | PKI mount path |
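
With the address and token exported, the stock Vault CLI can sanity-check the KV mount; the `soc` path below is purely hypothetical:

```bash
export VAULT_ADDR=http://vault:8200
export VAULT_TOKEN=<your-token>   # placeholder

# List keys under the KV-v2 mount (the soc path is a hypothetical example)
vault kv list secret/soc
```
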
## Authentication

| Variable | Default | Description |
|---|---|---|
| JWT_SECRET_KEY | — | Required. JWT signing secret (≥32 chars) |
| JWT_EXPIRY_HOURS | 24 | Token lifetime in hours |
| API_SERVICE_KEY | — | Required. Static bootstrap API key for service auth |
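
One common way to satisfy the ≥32-character requirement for JWT_SECRET_KEY is to generate it with `openssl rand` (one option among many):

```bash
# 32 random bytes, hex-encoded => a 64-character secret
export JWT_SECRET_KEY="$(openssl rand -hex 32)"
```
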
## Approval Policy

| Variable | Default | Description |
|---|---|---|
| APPROVAL_EXPIRES_MINUTES | 30 | Default expiration window for human approval requests |
| APPROVAL_WAIT_TIMEOUT_SECONDS | 300 | Default wait timeout for approval polling flows |
| APPROVAL_POLL_INTERVAL_SECONDS | 5 | Poll interval for approval decision checks |
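
For example, a stricter on-call profile might shorten the approval window and poll more aggressively (illustrative values; pick numbers that match your team's response times):

```bash
APPROVAL_EXPIRES_MINUTES=10          # expire unanswered requests sooner
APPROVAL_WAIT_TIMEOUT_SECONDS=120    # give up waiting after 2 minutes
APPROVAL_POLL_INTERVAL_SECONDS=2     # check for decisions more often
```
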
## Observability

| Variable | Default | Description |
|---|---|---|
| OTEL_EXPORTER_ENDPOINT | http://otel-collector:4317 | OTLP gRPC endpoint |
| OTEL_SERVICE_NAME | aurorasoc | Service name in traces |
| OTEL_PROMETHEUS_PORT | 9090 | Port on which Prometheus metrics are exposed |
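
When running outside the compose network against a locally launched collector, the endpoint typically moves to localhost (4317 is the standard OTLP gRPC port; the other values are the defaults from the table):

```bash
OTEL_EXPORTER_ENDPOINT=http://localhost:4317
OTEL_SERVICE_NAME=aurorasoc
OTEL_PROMETHEUS_PORT=9090
```
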
## A2A Resolution

These variables statically override agent service discovery:

| Variable | Default | Description |
|---|---|---|
| A2A_DISCOVERY_MODE | compose | How agent addresses are resolved (compose or k8s) |
| A2A_CLIENT_HOST | — | Global override for all A2A target hostnames |
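
For instance, when all agents run on a single development host, one override can redirect every A2A target (using `localhost` here assumes the agents are co-located):

```bash
A2A_DISCOVERY_MODE=compose
A2A_CLIENT_HOST=localhost   # every A2A target hostname resolves to this value
```
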
## Agent Deployment

| Variable | Default | Description |
|---|---|---|
| ENABLED_AGENTS | all | Comma-separated list of specialist agent names to instantiate at startup (e.g. SecurityAnalyst,ThreatHunter,IncidentResponder), or all to start the full fleet. The Orchestrator always runs regardless of this setting. |
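
For example, to start only two specialists on a constrained machine (agent names taken from the example in the table; the Orchestrator starts either way):

```bash
ENABLED_AGENTS=SecurityAnalyst,ThreatHunter
```
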
## GPU / vLLM Tuning

These variables are read by the vllm service in docker-compose.gpu.yml when running on consumer GPUs with limited VRAM.

| Variable | Default | Description |
|---|---|---|
| VLLM_GPU_MEMORY_UTIL | 0.90 | Fraction of GPU VRAM vLLM may use for model weights and KV cache (0.0–1.0). Reduce this if you hit OOM errors. |
| VLLM_MAX_MODEL_LEN | 8192 | Maximum sequence length (prompt + completion tokens). Lower values use less VRAM. |
| CUDA_VISIBLE_DEVICES | (unset) | GPU index to expose to the vLLM container. Leave unset to use all GPUs. |
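
On an 8 GB consumer card, a conservative launch might look like the following; the base docker-compose.yml file name is an assumption, and the utilization and context values are illustrative rather than tuned:

```bash
# Reduce VRAM pressure before bringing up the vLLM service
VLLM_GPU_MEMORY_UTIL=0.85 \
VLLM_MAX_MODEL_LEN=4096 \
CUDA_VISIBLE_DEVICES=0 \
docker compose -f docker-compose.yml -f docker-compose.gpu.yml up -d vllm
```
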
## Verifying Settings

Check all loaded settings by starting the API with DEBUG=true:

```bash
DEBUG=true python -m aurorasoc.api.main
```

Settings are printed at INFO level during startup.