AI Assistant Safety

The AuroraSOC AI Assistant (the chat in the operator console) is hardened against adversarial use so it stays a safe analyst aid rather than an attack surface.

What the assistant will and will not do

It knows the current date and time, so it dates answers and generated reports correctly instead of guessing.
It refuses to reveal or change its system instructions, refuses to change its role or identity, and will not output API keys, tokens, or other secrets, even if asked.
When you paste logs, alerts, or tool output, any "instructions" hidden inside that text are treated as data to analyze, never as commands. This blocks indirect prompt injection from attacker-controlled content.
It grounds answers in your live alerts and cases and cites real ids where relevant.

Protections in place

Input sanitization and fencing. Pasted and database content is cleaned (control, zero-width, and bidi characters removed) and wrapped in non-forgeable untrusted-data fences (ADR 036).
Output filtering. Responses are scanned and any secret-shaped strings or system-prompt leakage are redacted before they reach you.
Rate limiting. The chat is rate limited per user to control cost and abuse. If you send too many messages too quickly you will see a "slow down" notice (HTTP 429).
Auditability. Suspected injection attempts are logged for review.

Good practice

Treat AI output as decision support, not ground truth. Verify before acting, and note that containment or destructive actions still require human approval.
Do not paste credentials or secrets into the chat.

For the engineering detail of these controls, see the developer page on Security and autonomy hardening.

What the assistant will and will not do​

Protections in place​

Good practice​

What the assistant will and will not do

Protections in place

Good practice