Event-Driven Pipeline
AuroraSOC processes security events through a multi-stage, event-driven pipeline using three complementary messaging systems. This architecture ensures events are processed reliably, at scale, and with proper ordering.
Pipeline Overview
Default deployments use the Python API and MQTT consumer bridge to normalize and validate events before publishing them into Redis Streams. Enable `--profile rust-core` only when you want the optional Rust fast path to handle high-throughput ingest or attestation workloads.
Why Three Messaging Systems?
Each messaging system serves a distinct purpose:
| System | Purpose | Pattern | Durability |
|---|---|---|---|
| Redis Streams | Internal event bus | Consumer groups | In-memory + AOF |
| NATS JetStream | Cross-site federation | Pub/Sub + persistence | Disk-backed |
| MQTT | IoT device communication | Pub/Sub + QoS levels | Broker-dependent |
Redis Streams — The Internal Bus
Redis Streams is the primary event bus within a single AuroraSOC deployment.
Why Redis Streams?
- Consumer groups — Multiple consumers share the workload; each message is delivered to one consumer in the group (see the sketch after this list)
- Message acknowledgment — Failed messages can be re-processed
- Stream trimming — Automatic cleanup of old messages (configurable max length)
- Sub-millisecond latency — Critical for real-time alerting
- Already deployed — Redis is used for caching, so no additional infrastructure
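A minimal consumer sketch, assuming the redis-py client (`pip install redis`). The stream and group names come from the production table in the next section; the consumer name and alert handler are illustrative placeholders:

```python
import redis

r = redis.Redis(decode_responses=True)
STREAM, GROUP, CONSUMER = "aurora:alerts", "api-ws-alerts", "ws-broadcaster-1"

def handle_alert(fields: dict) -> None:
    print("alert:", fields)  # placeholder for the real WebSocket broadcast

# Create the group once; mkstream=True also creates the stream if empty.
try:
    r.xgroup_create(STREAM, GROUP, id="0", mkstream=True)
except redis.exceptions.ResponseError:
    pass  # BUSYGROUP: the group already exists

while True:
    # '>' requests messages never delivered to this group before.
    for _stream, messages in r.xreadgroup(
        GROUP, CONSUMER, {STREAM: ">"}, count=10, block=5000
    ) or []:
        for msg_id, fields in messages:
            handle_alert(fields)
            # Ack so the entry leaves the pending list and is not re-delivered.
            r.xack(STREAM, GROUP, msg_id)
```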
Agent Task Worker Architecture
AuroraSOC runs a dedicated worker process at `aurorasoc/workers/agent_task_worker.py` to execute queued tasks and publish correlated results (sketched after the table below).
Consumer groups used in production:
| Stream | Consumer Group | Primary Consumer |
|---|---|---|
| `aurora:agent:tasks` | `aurora-agents` | AgentTaskWorker |
| `aurora:agent:results` | `api-agent-results` | FastAPI result correlator |
| `aurora:alerts` | `api-ws-alerts` | WebSocket alert broadcaster |
| `aurora:audit` | `api-ws-thoughts` | WebSocket reasoning broadcaster |
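A sketch of the worker loop under the same redis-py assumptions: it consumes from `aurora:agent:tasks`, runs the task, and publishes the outcome to `aurora:agent:results` so the FastAPI correlator can resolve the waiting future. `execute_task` and the payload fields are hypothetical stand-ins for the real agent dispatch:

```python
import json
import redis

r = redis.Redis(decode_responses=True)

def execute_task(task: dict) -> dict:
    return {"status": "done"}  # hypothetical stand-in for agent execution

# Assumes the aurora-agents group was created as in the earlier sketch.
while True:
    for _stream, messages in r.xreadgroup(
        "aurora-agents", "worker-1", {"aurora:agent:tasks": ">"},
        count=1, block=5000,
    ) or []:
        for msg_id, task in messages:
            result = execute_task(task)
            # Echo the task_id so the API-side correlator can match the
            # result to the future awaiting it.
            r.xadd("aurora:agent:results", {
                "task_id": task.get("task_id", msg_id),
                "result": json.dumps(result),
            })
            r.xack("aurora:agent:tasks", "aurora-agents", msg_id)
```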
Stream Lag Metrics
Prometheus should track stream lag and throughput for the task/result path:
| Metric | Description | Alert Suggestion |
|---|---|---|
| `aurora_agent_tasks_pending_total` | Pending messages in the `aurora:agent:tasks` group | Warn if > 200 for 5 min |
| `aurora_agent_task_retry_total` | Count of tasks requeued by the worker | Warn on sustained growth |
| `aurora_agent_dead_letter_total` | Count of dead-lettered tasks | Page on any non-zero spike |
| `aurora_agent_result_correlation_latency_ms` | Publish→future-resolution latency | Warn if p95 > 5000 ms |
| `aurora_agent_results_unmatched_total` | Result events with no waiting future | Investigate if steadily increasing |
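A sketch of exporting the first metric in the table above, assuming `prometheus_client` and redis-py; the scrape port and poll interval are illustrative:

```python
import time
import redis
from prometheus_client import Gauge, start_http_server

r = redis.Redis(decode_responses=True)
PENDING = Gauge(
    "aurora_agent_tasks_pending_total",
    "Pending messages in the aurora:agent:tasks consumer group",
)

start_http_server(9102)  # hypothetical scrape port
while True:
    # XPENDING summary counts delivered-but-unacknowledged entries.
    summary = r.xpending("aurora:agent:tasks", "aurora-agents")
    PENDING.set(summary["pending"])
    time.sleep(15)
```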
NATS JetStream — Cross-Site Federation
For organizations running multiple AuroraSOC instances, NATS JetStream federates events between sites.
The AURORA JetStream stream carries the following subjects:
- `AURORA.alerts` — Alert federation
- `AURORA.ioc_sharing` — IOC dissemination
- `AURORA.agent_results` — Investigation sharing
- `AURORA.audit` — Cross-site audit trail
Why NATS JetStream?
- Persistent delivery — Messages survive broker restarts
- Exactly-once semantics — Via consumer acknowledgment + deduplication (see the publish sketch after this list)
- Geographic distribution — NATS clusters span data centers
- Lightweight — Single binary, minimal resource usage
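A publish-side sketch using the nats-py client; the payload fields and the alert ID used for deduplication are illustrative. Setting the `Nats-Msg-Id` header is what lets JetStream drop duplicates inside the stream's dedup window:

```python
import asyncio
import json
import nats

async def main() -> None:
    nc = await nats.connect("nats://localhost:4222")
    js = nc.jetstream()
    # JetStream deduplicates on Nats-Msg-Id within the stream's dedup
    # window, so a retried publish of the same alert is not double-stored.
    await js.publish(
        "AURORA.alerts",
        json.dumps({"alert_id": "a-123", "severity": "high"}).encode(),
        headers={"Nats-Msg-Id": "a-123"},
    )
    await nc.close()

asyncio.run(main())
```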
MQTT — IoT Edge Communication
MQTT connects resource-constrained IoT devices to the pipeline.
Why MQTT?
- Minimal overhead — 2-byte header, ideal for constrained devices (nRF52840 with 256KB RAM)
- QoS levels — Guaranteed delivery even with intermittent connectivity
- Topic wildcards — `aurora/sensors/+/#` captures all device telemetry (see the subscriber sketch after this list)
- Industry standard — Every IoT platform speaks MQTT
- TLS support — Encrypted device-to-broker communication
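A subscriber sketch using paho-mqtt 2.x; the broker hostname is a placeholder. Note the multi-level `#` wildcard and QoS 1 for at-least-once delivery over intermittent links:

```python
import paho.mqtt.client as mqtt

def on_message(client, userdata, msg):
    print(msg.topic, msg.payload)  # placeholder for the ingest bridge

client = mqtt.Client(mqtt.CallbackAPIVersion.VERSION2)
client.tls_set()  # TLS using the system CA bundle
client.on_message = on_message
client.connect("mqtt.example.internal", 8883)  # hypothetical broker
client.subscribe("aurora/sensors/+/#", qos=1)  # all device telemetry
client.loop_forever()
```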
Event Processing Guarantees
| Guarantee | Implementation |
|---|---|
| At-least-once delivery | Consumer group ack + re-delivery on timeout |
| Ordering | Per-stream ordering via Redis Streams |
| Deduplication | SHA-256 hash-based dedup in scheduler (5-min window; sketched below) |
| Backpressure | Consumer groups naturally provide backpressure |
| Dead letter | Tasks that fail 3 retries are moved to `aurora:agent:dead_letter` |
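A sketch of the SHA-256 dedup guarantee from the table above, assuming the scheduler keeps digests in Redis; the key prefix and serialization are illustrative:

```python
import hashlib
import json
import redis

r = redis.Redis(decode_responses=True)
DEDUP_TTL = 300  # the 5-minute window, in seconds

def is_duplicate(event: dict) -> bool:
    digest = hashlib.sha256(
        json.dumps(event, sort_keys=True).encode()
    ).hexdigest()
    # SET NX returns None when the key already exists, i.e. the same
    # event was seen within the last 5 minutes.
    return r.set(f"aurora:dedup:{digest}", "1", nx=True, ex=DEDUP_TTL) is None
```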
AuroraSOC follows the "smart endpoints, dumb pipes" philosophy. The messaging systems are simple transport — all business logic lives in the consuming services (API, agents, scheduler).
Event Flow Example: Alert Processing
Here's the complete journey of a security event through the default (Python) path:
1. A sensor or endpoint publishes an event over MQTT (TLS-encrypted, QoS-backed).
2. The MQTT consumer bridge normalizes and validates the event, then publishes it into Redis Streams.
3. The scheduler deduplicates the event (SHA-256, 5-minute window) and queues a task on `aurora:agent:tasks`.
4. The AgentTaskWorker (`aurora-agents` group) executes the task and publishes the outcome to `aurora:agent:results`.
5. The FastAPI result correlator resolves the waiting future; resulting alerts land on `aurora:alerts` and fan out to clients via the WebSocket alert broadcaster.
When `--profile rust-core` is enabled, the Rust service can perform the ingest normalization step before publishing into the same Redis streams shown above.