
Inter-site federation mesh

Most real organizations are not one building. AuroraSOC runs a complete, autonomous SOC at every site and connects the sites in an encrypted mesh, so each location keeps protecting itself even when its link to the rest of the organization goes down.

Figure: Inter-site mesh topology

How the mesh works

Sites attach to each other at the messaging layer: each site's NATS server connects to the primary site as a leafnode, so any event published at one site reaches every peer while each site keeps its own local stores. There is no central broker to lose.

Two kinds of traffic cross the mesh:

  • Health gossip. Every site publishes a heartbeat (identity, tier, agent counts) on a fixed interval. Peers use the heartbeat age to derive live link state - healthy, degraded, or down - which is exactly what the SOC Site Topology map on the Agent Fleet page renders.
  • Federated alerts. Alerts at or above the configured severity floor (high by default) are stamped with their origin site and shared with every peer. They arrive in the receiving site's alert queue with a purple origin badge (for example site-b), already deduplicated against replays.
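The federated-alert path can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not AuroraSOC's implementation: the Alert fields, the severity ordering, and the names prepare_for_federation and FederatedInbox are all hypothetical. It shows the three behaviours described above: the severity floor, the origin stamp, and deduplication against replays.

```python
from dataclasses import dataclass, replace
from typing import Optional

# Assumed severity ordering; the docs only name "high" as the default floor.
SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2, "critical": 3}


@dataclass(frozen=True)
class Alert:
    """Hypothetical alert shape used only for this sketch."""
    alert_id: str
    severity: str
    title: str
    origin_site: Optional[str] = None  # set once the alert is federated


def prepare_for_federation(alert: Alert, site_id: str,
                           min_severity: str = "high") -> Optional[Alert]:
    """Return a copy stamped with the origin site, or None if below the floor."""
    if SEVERITY_RANK[alert.severity] < SEVERITY_RANK[min_severity]:
        return None
    return replace(alert, origin_site=site_id)


class FederatedInbox:
    """Receiving side: queues peer alerts and drops replays."""

    def __init__(self, local_site_id: str) -> None:
        self.local_site_id = local_site_id
        self._seen = set()   # (origin_site, alert_id) pairs already queued
        self.queue = []      # alerts accepted from peers

    def receive(self, alert: Alert) -> bool:
        """Queue the alert and return True, or return False if it is ignored."""
        if alert.origin_site is None or alert.origin_site == self.local_site_id:
            return False  # unstamped, or our own alert echoed back
        key = (alert.origin_site, alert.alert_id)
        if key in self._seen:
            return False  # replay of an alert we already hold
        self._seen.add(key)
        self.queue.append(alert)
        return True


shared = prepare_for_federation(Alert("a-2", "high", "Lateral movement"), "site-b")
inbox = FederatedInbox("site-a")
first = inbox.receive(shared)    # accepted
second = inbox.receive(shared)   # ignored as a replay
```

Keying the dedup set on the pair (origin site, alert id) is a design choice of this sketch: it lets two sites reuse the same local alert id without colliding.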

Partition tolerance

A site that loses its mesh link keeps detecting, investigating, and responding with its local policy - nothing about the local loop depends on the mesh. The topology map marks the link down and the peer site offline; when connectivity returns, gossip resumes and the map heals on its own.
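Because link state is derived purely from heartbeat age, a partition and its recovery need no special handling. The sketch below illustrates that idea in Python. The heartbeat interval, the thresholds for degraded and down, and the PeerTable class are assumptions made for the example; the source does not state the actual values.

```python
# Assumed values: the docs say "fixed interval" without giving a number.
HEARTBEAT_INTERVAL_S = 10.0
DEGRADED_AFTER_INTERVALS = 2   # assumed threshold
DOWN_AFTER_INTERVALS = 5       # assumed threshold


def link_state(heartbeat_age_s: float,
               interval_s: float = HEARTBEAT_INTERVAL_S) -> str:
    """Map the age of the last heartbeat to healthy, degraded, or down."""
    if heartbeat_age_s < DEGRADED_AFTER_INTERVALS * interval_s:
        return "healthy"
    if heartbeat_age_s < DOWN_AFTER_INTERVALS * interval_s:
        return "degraded"
    return "down"


class PeerTable:
    """Hypothetical per-site record of when each peer was last heard from."""

    def __init__(self) -> None:
        self._last_seen = {}  # site_id -> timestamp of last heartbeat

    def record_heartbeat(self, site_id: str, now: float) -> None:
        self._last_seen[site_id] = now

    def states(self, now: float) -> dict:
        """Derive the current link state of every known peer."""
        return {site: link_state(now - seen)
                for site, seen in self._last_seen.items()}


peers = PeerTable()
peers.record_heartbeat("site-b", now=0.0)
before = peers.states(now=5.0)     # link is fresh
during = peers.states(now=60.0)    # partition: no heartbeat for 60 s
peers.record_heartbeat("site-b", now=61.0)
after = peers.states(now=62.0)     # gossip resumed, state heals
```

Nothing in the table is reset when the link returns: the next heartbeat simply makes the age small again, which is why the map can heal on its own.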

Enabling federation

Federation is off by default. On each site set:

FEDERATION_ENABLED=true
# unique per site
FEDERATION_SITE_ID=site-b
FEDERATION_SITE_NAME="Branch SOC - Cairo"
FEDERATION_SITE_TIER=secondary
# minimum severity shared with peers
FEDERATION_FEDERATE_MIN_SEVERITY=high
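To make the meaning of these settings concrete, here is a minimal Python sketch of how a service might read and validate them. It is not AuroraSOC's actual loader: the FederationConfig class, the load_federation_config function, the list of allowed severities and tiers, and the fallback of the site name to the site id are all assumptions for illustration. Only the variable names and the "high" default come from the settings above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

SEVERITIES = ("low", "medium", "high", "critical")  # assumed set of levels
TIERS = ("primary", "secondary")                     # assumed set of tiers


@dataclass(frozen=True)
class FederationConfig:
    site_id: str
    site_name: str
    site_tier: str
    min_severity: str


def load_federation_config(
    env: Mapping[str, str] = os.environ,
) -> Optional[FederationConfig]:
    """Return the federation settings, or None when federation is off."""
    # Federation is off by default, so a missing flag means disabled.
    if env.get("FEDERATION_ENABLED", "false").strip().lower() != "true":
        return None

    site_id = env.get("FEDERATION_SITE_ID", "").strip()
    if not site_id:
        raise ValueError("FEDERATION_SITE_ID must be set and unique per site")

    tier = env.get("FEDERATION_SITE_TIER", "").strip().lower()
    if tier not in TIERS:
        raise ValueError(f"FEDERATION_SITE_TIER must be one of {TIERS}")

    # "high" is the documented default severity floor.
    floor = env.get("FEDERATION_FEDERATE_MIN_SEVERITY", "high").strip().lower()
    if floor not in SEVERITIES:
        raise ValueError(
            f"FEDERATION_FEDERATE_MIN_SEVERITY must be one of {SEVERITIES}"
        )

    # Falling back to the site id for the display name is an assumption.
    name = env.get("FEDERATION_SITE_NAME", "").strip() or site_id
    return FederationConfig(site_id, name, tier, floor)


example = load_federation_config({
    "FEDERATION_ENABLED": "true",
    "FEDERATION_SITE_ID": "site-b",
    "FEDERATION_SITE_NAME": "Branch SOC - Cairo",
    "FEDERATION_SITE_TIER": "secondary",
})
```

Passing a plain mapping instead of reading os.environ directly keeps the function easy to exercise in tests.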

For the two-site development demo, run just stack-up-site-b to bring up a complete second site (its own Postgres, Redis, and NATS leafnode) beside the primary stack, and run just migrate-site-b to prepare its database. See ADR 029 for the design and its deliberate limits: policy distribution and cross-site case handoff remain with the future federation controller.