# Environment Variables Reference

A complete reference of the runtime environment variables used by the current codebase.

## Core Application

| Variable | Default | Description |
|---|---|---|
| ENVIRONMENT | development | Runtime environment (development, staging, production) |
| DEBUG | false | Enable debug mode (verbose logging, auto-reload) |
| LOG_LEVEL | INFO | Logging level: DEBUG, INFO, WARNING, ERROR |
| SYSTEM_MODE | real | Runtime guardrail mode (dummy, dry_run, real) |
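
As a quick illustration, a local development `.env` built from the table above might look like this (values are examples, not prescriptions; `dry_run` is assumed here to exercise flows without irreversible actions):

```bash
# Example .env fragment for local development (illustrative values)
ENVIRONMENT=development
DEBUG=true
LOG_LEVEL=DEBUG
SYSTEM_MODE=dry_run   # assumed to run the pipeline without real side effects
```
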
## LLM Provider

| Variable | Default | Description |
|---|---|---|
| LLM_BACKEND | vllm | Inference backend (vllm or ollama) |
| VLLM_MODEL | granite-soc-specialist | vLLM specialist model |
| VLLM_ORCHESTRATOR_MODEL | granite-soc-specialist | vLLM orchestrator model |
| VLLM_BASE_URL | http://vllm:8000/v1 | vLLM instance URL |
| OLLAMA_MODEL | granite4:8b | Ollama specialist model |
| OLLAMA_ORCHESTRATOR_MODEL | granite4:8b | Ollama orchestrator model. Keep this equal to OLLAMA_MODEL for single-model local deployments. |
| OLLAMA_BASE_URL | http://ollama:11434 | Ollama instance URL |
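
For example, switching the stack from vLLM to a local Ollama daemon typically only requires changing the backend selector and, when running outside the compose network, the base URL (the `localhost` substitution below is an assumption about your setup):

```bash
# Point all agents at a local Ollama daemon instead of vLLM
LLM_BACKEND=ollama
OLLAMA_BASE_URL=http://localhost:11434   # use http://ollama:11434 inside compose
OLLAMA_MODEL=granite4:8b
OLLAMA_ORCHESTRATOR_MODEL=granite4:8b    # keep equal for single-model deployments
```
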
## Local Single-Model Controls

These variables keep all AuroraSOC agents on one local inference service and one model tag, which is the recommended MVP path for 8 GB VRAM laptops and other constrained single-machine deployments.

| Variable | Default | Description |
|---|---|---|
| GRANITE_SINGLE_MODEL_MODE | true | Forces Ollama specialist and orchestrator agents to resolve to the same model tag, even if per-agent model routing or a separate orchestrator tag is configured. |
| GRANITE_USE_SHARED_MODEL_POOL | true | Reuses process-local BeeAI ChatModel instances for identical backend/model/provider tuples. |
| GRANITE_MAX_CONCURRENT_REQUESTS | 1 | Recommended local inference concurrency. Keep at 1 on 8 GB VRAM to avoid model thrash. |
| GRANITE_REQUEST_QUEUE_SIZE | 16 | Planned bounded queue size for shared LLM request admission. |
| GRANITE_INFERENCE_TIMEOUT_SECONDS | 180 | Planned per-request timeout budget for local LLM inference. |

For a quota-limited local install, inspect the installed models first with `ollama list`. If a usable model is already present, point both OLLAMA_MODEL and OLLAMA_ORCHESTRATOR_MODEL at that tag instead of pulling another model, as sketched below.
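
A minimal sketch of that workflow (the granite4:8b tag is just the default from the table above; substitute whatever `ollama list` actually reports):

```bash
# 1. See which models are already installed locally
ollama list

# 2. Reuse an existing tag for both roles instead of pulling a second model
export OLLAMA_MODEL=granite4:8b
export OLLAMA_ORCHESTRATOR_MODEL=granite4:8b
```
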
## PostgreSQL

| Variable | Default | Description |
|---|---|---|
| PG_HOST | postgres | PostgreSQL hostname |
| PG_PORT | 5432 | PostgreSQL port |
| PG_USER | aurorasoc | Database user |
| PG_PASSWORD | (required in compose) | Database password |
| PG_DATABASE | aurorasoc | Database name |
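
A quick way to confirm the PG_* values are correct is a one-off psql probe; this sketch assumes the psql client is installed and the variables are already exported:

```bash
# Connectivity check using the same variables the application reads
PGPASSWORD="$PG_PASSWORD" psql \
  -h "$PG_HOST" -p "$PG_PORT" \
  -U "$PG_USER" -d "$PG_DATABASE" \
  -c 'SELECT version();'
```
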
## Redis

| Variable | Default | Description |
|---|---|---|
| REDIS_URL | redis://redis:6379 | Direct connection URL |
## NATS JetStream

| Variable | Default | Description |
|---|---|---|
| NATS_URL | nats://localhost:4222 | NATS server URL |
| NATS_STREAM_NAME | AURORA | JetStream stream name |
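
Note that NATS_URL defaults to localhost, unlike the other services, which default to compose hostnames. Inside the compose network you will typically override it with the broker's service name (shown as `nats` here, which is an assumption about your compose file):

```bash
# Override the localhost default when running inside docker compose
NATS_URL=nats://nats:4222
```
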
## MQTT

| Variable | Default | Description |
|---|---|---|
| MQTT_HOST | mosquitto | MQTT broker hostname |
| MQTT_PORT | 8883 | MQTT broker port (mTLS-first) |
| MQTT_USERNAME | — | MQTT username (optional) |
| MQTT_PASSWORD | — | MQTT password (optional) |
| MQTT_TOPIC_PREFIX | aurora | MQTT topic prefix |
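
For a quick local test against a plain (non-TLS) broker, the conventional MQTT port can be substituted; whether the stack accepts non-TLS connections is deployment-specific, so treat this as a sketch:

```bash
# Local testing against a broker without TLS (1883 is the conventional plain port)
MQTT_HOST=localhost
MQTT_PORT=1883
MQTT_TOPIC_PREFIX=aurora
```
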
## pgvector (PostgreSQL Extension)

Vector embedding storage is handled by pgvector inside the same PostgreSQL instance, so no separate service is needed. The relevant pool and SSL settings are listed below.

| Variable | Default | Description |
|---|---|---|
| PG_POOL_SIZE | 20 | SQLAlchemy connection pool size |
| PG_MAX_OVERFLOW | 10 | Extra connections beyond the pool size |
| PG_SSLMODE | prefer | SSL mode (disable, prefer, require, verify-ca, verify-full) |
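
As an example, a production-leaning profile might tighten the SSL mode and shrink the pool for a small database instance (values are illustrative, not tuned recommendations):

```bash
PG_POOL_SIZE=10
PG_MAX_OVERFLOW=5
PG_SSLMODE=verify-full   # requires a CA certificate the server cert chains to
```
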
## HashiCorp Vault

| Variable | Default | Description |
|---|---|---|
| VAULT_ADDR | http://vault:8200 | Vault server URL |
| VAULT_TOKEN | — | Vault access token |
| VAULT_KV_MOUNT | secret | KV-v2 secrets mount path |
| VAULT_PKI_MOUNT | pki_iot | PKI mount path |
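
With the address and token exported, the stock Vault CLI can sanity-check the KV mount; the `soc` path below is purely hypothetical:

```bash
export VAULT_ADDR=http://vault:8200
export VAULT_TOKEN=<your-token>   # placeholder

# List keys under the KV-v2 mount (the soc path is a hypothetical example)
vault kv list secret/soc
```
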
## Authentication

| Variable | Default | Description |
|---|---|---|
| JWT_SECRET_KEY | — | Required. JWT signing secret (≥32 chars) |
| JWT_EXPIRY_HOURS | 24 | Token lifetime in hours |
| API_SERVICE_KEY | — | Required. Static bootstrap API key for service auth |
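
One common way to satisfy the ≥32-character requirement for JWT_SECRET_KEY is to generate it with `openssl rand` (one option among many):

```bash
# 32 random bytes, hex-encoded => a 64-character secret
export JWT_SECRET_KEY="$(openssl rand -hex 32)"
```
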
## Approval Policy

| Variable | Default | Description |
|---|---|---|
| APPROVAL_EXPIRES_MINUTES | 30 | Default expiration window for human approval requests |
| APPROVAL_WAIT_TIMEOUT_SECONDS | 300 | Default wait timeout for approval polling flows |
| APPROVAL_POLL_INTERVAL_SECONDS | 5 | Poll interval for approval decision checks |
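
For example, a stricter on-call profile might shorten the approval window and poll more aggressively (illustrative values; pick numbers that match your team's response times):

```bash
APPROVAL_EXPIRES_MINUTES=10          # expire unanswered requests sooner
APPROVAL_WAIT_TIMEOUT_SECONDS=120    # give up waiting after 2 minutes
APPROVAL_POLL_INTERVAL_SECONDS=2     # check for decisions more often
```
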
## Observability

| Variable | Default | Description |
|---|---|---|
| OTEL_EXPORTER_ENDPOINT | http://otel-collector:4317 | OTLP gRPC endpoint |
| OTEL_SERVICE_NAME | aurorasoc | Service name in traces |
| OTEL_PROMETHEUS_PORT | 9090 | Port on which Prometheus metrics are exposed |
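
When running outside the compose network against a locally launched collector, the endpoint typically moves to localhost (4317 is the standard OTLP gRPC port; the other values are the defaults from the table):

```bash
OTEL_EXPORTER_ENDPOINT=http://localhost:4317
OTEL_SERVICE_NAME=aurorasoc
OTEL_PROMETHEUS_PORT=9090
```
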
## A2A Resolution

These variables statically override agent service discovery:

| Variable | Default | Description |
|---|---|---|
| A2A_DISCOVERY_MODE | compose | How agent addresses are resolved (compose or k8s) |
| A2A_CLIENT_HOST | — | Global override for all A2A target hostnames |
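
For instance, when all agents run on a single development host, one override can redirect every A2A target (using `localhost` here assumes the agents are co-located):

```bash
A2A_DISCOVERY_MODE=compose
A2A_CLIENT_HOST=localhost   # every A2A target hostname resolves to this value
```
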
## Agent Deployment

| Variable | Default | Description |
|---|---|---|
| ENABLED_AGENTS | all | Comma-separated list of specialist agent names to instantiate at startup (e.g. SecurityAnalyst,ThreatHunter,IncidentResponder), or all to start the full fleet. The Orchestrator always runs regardless of this setting. |
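
For example, to start only two specialists on a constrained machine (agent names taken from the example in the table; the Orchestrator starts either way):

```bash
ENABLED_AGENTS=SecurityAnalyst,ThreatHunter
```
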
## GPU / vLLM Tuning

These variables are read by the vllm service in docker-compose.gpu.yml when running on consumer GPUs with limited VRAM.

| Variable | Default | Description |
|---|---|---|
| VLLM_GPU_MEMORY_UTIL | 0.90 | Fraction of GPU VRAM vLLM may use for model weights and KV cache (0.0–1.0). Reduce this if you hit OOM errors. |
| VLLM_MAX_MODEL_LEN | 8192 | Maximum sequence length (prompt + completion tokens). Lower values use less VRAM. |
| CUDA_VISIBLE_DEVICES | (unset) | GPU index to expose to the vLLM container. Leave unset to use all GPUs. |
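
On an 8 GB consumer card, a conservative launch might look like the following; the base docker-compose.yml file name is an assumption, and the utilization and context values are illustrative rather than tuned:

```bash
# Reduce VRAM pressure before bringing up the vLLM service
VLLM_GPU_MEMORY_UTIL=0.85 \
VLLM_MAX_MODEL_LEN=4096 \
CUDA_VISIBLE_DEVICES=0 \
docker compose -f docker-compose.yml -f docker-compose.gpu.yml up -d vllm
```
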
## Verifying Settings

Check all loaded settings by starting the API with DEBUG=true:

```bash
DEBUG=true python -m aurorasoc.api.main
```

Settings are printed at INFO level during startup.