Event-Driven Pipeline

AuroraSOC processes security events through a multi-stage, event-driven pipeline using three complementary messaging systems. This architecture ensures events are processed reliably, at scale, and with proper ordering.

Pipeline Overview

Default deployments use the Python API and MQTT consumer bridge to normalize and validate events before publishing them into Redis Streams. Enable --profile rust-core only when you want the optional Rust fast path to handle high-throughput ingest or attestation workloads.
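Assuming the stack is deployed with Docker Compose (the --profile flag above is Compose syntax; service names are not shown here), toggling the optional Rust fast path might look like:

```shell
# Default stack: Python API + MQTT consumer bridge + Redis Streams
docker compose up -d

# Opt in to the Rust fast path for high-throughput ingest/attestation
docker compose --profile rust-core up -d
```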

Why Three Messaging Systems?

Each messaging system serves a distinct purpose:

System           Purpose                     Pattern                  Durability
Redis Streams    Internal event bus          Consumer groups          In-memory + AOF
NATS JetStream   Cross-site federation       Pub/Sub + persistence    Disk-backed
MQTT             IoT device communication    Pub/Sub + QoS levels     Broker-dependent

Redis Streams — The Internal Bus

Redis Streams is the primary event bus within a single AuroraSOC deployment:

Why Redis Streams?

  • Consumer groups — Multiple consumers share the workload, each message processed once
  • Message acknowledgment — Failed messages can be re-processed
  • Stream trimming — Automatic cleanup of old messages (configurable max length)
  • Sub-millisecond latency — Critical for real-time alerting
  • Already deployed — Redis is used for caching, so no additional infrastructure
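The consumer-group semantics above can be illustrated with a small, self-contained toy model. MiniStream and its method names are illustrative stand-ins for Redis's XADD/XREADGROUP/XACK, not the redis-py API:

```python
import itertools

class MiniStream:
    """Toy model of a Redis Stream with one consumer group.

    Illustrates the semantics the bus relies on: each entry is delivered
    to exactly one consumer in the group, stays pending until it is
    acknowledged, and old entries are trimmed past a max length.
    """

    def __init__(self, maxlen=1000):
        self.entries = []              # (id, payload) pairs, oldest first
        self.maxlen = maxlen
        self.next_id = itertools.count(1)
        self.delivered = {}            # id -> consumer (the pending list)
        self.cursor = 0                # group's last-delivered position

    def xadd(self, payload):
        entry_id = f"{next(self.next_id)}-0"
        self.entries.append((entry_id, payload))
        del self.entries[:-self.maxlen]    # stream trimming
        return entry_id

    def xreadgroup(self, consumer, count=1):
        # Each entry past the cursor goes to exactly one consumer.
        batch = self.entries[self.cursor:self.cursor + count]
        self.cursor += len(batch)
        for entry_id, _ in batch:
            self.delivered[entry_id] = consumer
        return batch

    def xack(self, entry_id):
        self.delivered.pop(entry_id, None)  # remove from pending list

    def pending(self):
        return sorted(self.delivered)
```

Two consumers in the same group then share the workload: if worker-1 and worker-2 each read once, they receive different entries, and only unacknowledged entries remain pending for re-processing.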

Agent Task Worker Architecture

AuroraSOC runs a dedicated worker process at aurorasoc/workers/agent_task_worker.py to execute queued tasks and publish correlated results.

Consumer groups used in production:

Stream                  Consumer Group       Primary Consumer
aurora:agent:tasks      aurora-agents        AgentTaskWorker
aurora:agent:results    api-agent-results    FastAPI result correlator
aurora:alerts           api-ws-alerts        WebSocket alert broadcaster
aurora:audit            api-ws-thoughts      WebSocket reasoning broadcaster

Stream Lag Metrics

Prometheus should track stream lag and throughput for the task/result path:

Metric                                        Description                                     Alert Suggestion
aurora_agent_tasks_pending_total              Pending messages in aurora:agent:tasks group    Warn if > 200 for 5 min
aurora_agent_task_retry_total                 Count of worker-requeued tasks                  Warn on sustained growth
aurora_agent_dead_letter_total                Count of dead-lettered tasks                    Page on any non-zero spike
aurora_agent_result_correlation_latency_ms    Publish→future-resolution latency               Warn if p95 > 5000 ms
aurora_agent_results_unmatched_total          Result events with no waiting future            Investigate if steadily increasing
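The first alert suggestion ("warn if > 200 for 5 min") can be expressed as a small evaluation helper. The function name and the sample format are hypothetical, for illustration only:

```python
def tasks_pending_alert(samples, threshold=200, window_s=300):
    """Return True if every sample in the trailing window breaches the threshold.

    `samples` is a list of (unix_ts, pending_count) pairs, oldest first,
    e.g. scraped values of aurora_agent_tasks_pending_total.
    """
    if not samples:
        return False
    horizon = samples[-1][0] - window_s
    window = [count for ts, count in samples if ts >= horizon]
    # Only fire once the window is fully covered by observations and
    # every observation inside it exceeds the threshold.
    covered = samples[0][0] <= horizon
    return covered and all(count > threshold for count in window)
```

In a real deployment this predicate would live in a Prometheus alerting rule rather than application code; the helper just makes the "for 5 min" semantics explicit.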

NATS JetStream — Cross-Site Federation

For organizations running multiple AuroraSOC instances, events are federated across sites over NATS JetStream.

The AURORA JetStream stream has subjects:

  • AURORA.alerts — Alert federation
  • AURORA.ioc_sharing — IOC dissemination
  • AURORA.agent_results — Investigation sharing
  • AURORA.audit — Cross-site audit trail

Why NATS JetStream?

  • Persistent delivery — Messages survive broker restarts
  • Exactly-once semantics — Via consumer acknowledgment + deduplication
  • Geographic distribution — NATS clusters span data centers
  • Lightweight — Single binary, minimal resource usage
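The publish side can be sketched against the subjects listed above. JetStream's broker-side deduplication keys on the Nats-Msg-Id header, which is how "exactly-once" publishing is layered on top of at-least-once delivery: republishing the same event within the dedup window is a no-op. The helper below is illustrative, not AuroraSOC's actual code:

```python
import hashlib
import json

FEDERATION_SUBJECTS = {
    "alerts": "AURORA.alerts",
    "ioc_sharing": "AURORA.ioc_sharing",
    "agent_results": "AURORA.agent_results",
    "audit": "AURORA.audit",
}

def federation_message(kind, event):
    """Build (subject, payload, headers) for a JetStream publish.

    Canonical JSON (sorted keys) makes the SHA-256 message ID stable,
    so retried publishes of the same event deduplicate on the broker.
    """
    subject = FEDERATION_SUBJECTS[kind]
    payload = json.dumps(event, sort_keys=True).encode()
    msg_id = hashlib.sha256(subject.encode() + payload).hexdigest()
    return subject, payload, {"Nats-Msg-Id": msg_id}
```

With a client such as nats-py, the returned tuple would map onto a JetStream publish call with those headers attached.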

MQTT — IoT Edge Communication

MQTT connects resource-constrained IoT devices to the event pipeline.

Why MQTT?

  • Minimal overhead — 2-byte minimum packet header, ideal for constrained devices (e.g., an nRF52840 with 256 KB RAM)
  • QoS levels — Guaranteed delivery even with intermittent connectivity
  • Topic wildcards — aurora/sensors/+/# captures all device telemetry
  • Industry standard — Every IoT platform speaks MQTT
  • TLS support — Encrypted device-to-broker communication
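The wildcard bullet can be made concrete with a simplified matcher: '+' matches exactly one topic level, '#' matches the remainder. This sketch omits MQTT's '$'-prefix rule and other edge cases:

```python
def topic_matches(pattern, topic):
    """Simplified MQTT topic filter matching (no '$'-prefix handling)."""
    p_levels = pattern.split("/")
    t_levels = topic.split("/")
    for i, p in enumerate(p_levels):
        if p == "#":
            return True            # multi-level wildcard: matches the rest
        if i >= len(t_levels):
            return False           # topic is shorter than the filter
        if p != "+" and p != t_levels[i]:
            return False           # literal level mismatch
    return len(p_levels) == len(t_levels)
```

So a sensor publishing to aurora/sensors/dev42/temp is captured by the filter above, while aurora/actuators/dev42/temp is not.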

Event Processing Guarantees

Guarantee                 Implementation
At-least-once delivery    Consumer group ack + re-delivery on timeout
Ordering                  Per-stream ordering via Redis Streams
Deduplication             SHA-256 hash-based dedup in scheduler (5-min window)
Backpressure              Consumer groups naturally provide backpressure
Dead letter               Failed tasks moved to aurora:agent:dead_letter after 3 retries

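The deduplication guarantee can be sketched as a small stdlib-only class. The class name, the canonical-JSON hashing, and the injected clock are illustrative assumptions, not the scheduler's actual code:

```python
import hashlib
import json
import time

class Deduplicator:
    """SHA-256 event dedup over a sliding window (default 5 minutes)."""

    def __init__(self, window_s=300, clock=time.monotonic):
        self.window_s = window_s
        self.clock = clock           # injectable for testing
        self.seen = {}               # digest -> last-seen timestamp

    def is_duplicate(self, event):
        # Canonical JSON (sorted keys) makes the hash key-order independent.
        digest = hashlib.sha256(
            json.dumps(event, sort_keys=True).encode()
        ).hexdigest()
        now = self.clock()
        # Expire entries that have fallen out of the window.
        self.seen = {d: t for d, t in self.seen.items()
                     if now - t < self.window_s}
        dup = digest in self.seen
        self.seen[digest] = now
        return dup
```

An event replayed within the window is dropped as a duplicate; the same event arriving after the window expires is processed again, which is the intended behavior for a rolling dedup window.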
Design Principle

AuroraSOC follows the "smart endpoints, dumb pipes" philosophy. The messaging systems are simple transport — all business logic lives in the consuming services (API, agents, scheduler).

Event Flow Example: Alert Processing

Here's the complete journey of a security event:
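The journey can be sketched as a chain of pure functions, one per pipeline stage. Stage and field names are illustrative, drawn from the sections above, not the actual service code:

```python
def normalize(raw):
    """MQTT/API ingest: map a raw device payload onto the event schema."""
    return {"source": raw.get("device", "unknown"),
            "signature": raw["sig"],
            "severity": raw.get("sev", "low")}

def publish_to_stream(event, stream="aurora:alerts"):
    """XADD onto the internal Redis Stream (stubbed as a return value here)."""
    return (stream, event)

def correlate(event):
    """Consumer-group worker: enrich the event and decide whether to alert."""
    event["alert"] = event["severity"] in ("high", "critical")
    return event

def process(raw):
    """End-to-end: ingest -> internal bus -> correlation -> alert decision."""
    stream, event = publish_to_stream(normalize(raw))
    return correlate(event)
```

In the real pipeline each arrow crosses a messaging boundary (MQTT broker, Redis Stream, WebSocket or NATS fan-out) rather than a function call, but the data flow is the same.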

When --profile rust-core is enabled, the Rust service can perform the ingest normalization step before publishing into the same Redis streams shown above.