AI Assistant Safety
The AuroraSOC AI Assistant (the chat in the operator console) is hardened against adversarial use so it stays a safe analyst aid rather than an attack surface.
What the assistant will and will not do
- It knows the current date and time, so it dates answers and generated reports correctly instead of guessing.
- It refuses to reveal or change its system instructions, refuses to change its role or identity, and will not output API keys, tokens, or other secrets, even if asked.
- When you paste logs, alerts, or tool output, any "instructions" hidden inside that text are treated as data to analyze, never as commands. This blocks indirect prompt injection from attacker-controlled content.
- It grounds answers in your live alerts and cases and cites real ids where relevant.
Protections in place
- Input sanitization and fencing. Pasted and database content is cleaned (control, zero-width, and bidi characters removed) and wrapped in non-forgeable untrusted-data fences (ADR 036).
- Output filtering. Responses are scanned and any secret-shaped strings or system-prompt leakage are redacted before they reach you.
- Rate limiting. The chat is rate limited per user to control cost and abuse. If you send too many messages too quickly you will see a "slow down" notice (HTTP 429).
- Auditability. Suspected injection attempts are logged for review.
Good practice
- Treat AI output as decision support, not ground truth. Verify before acting, and note that containment or destructive actions still require human approval.
- Do not paste credentials or secrets into the chat.
For the engineering detail of these controls, see the developer page on Security and autonomy hardening.