SOAR playbook authoring

AuroraSOC SOAR playbooks are YAML documents that define automated incident response workflows. Each playbook declares a trigger condition, an ordered sequence of steps, guardrail tiers for each step, and rollback actions so the orchestrator can reverse course when an alert is a false positive.

YAML schema reference

Top-level fields

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | yes | Unique playbook identifier (kebab-case) |
| description | string (block scalar) | yes | Human-readable description of the trigger and intent |
| severity_filter | string | yes | Minimum alert severity that triggers this playbook. One of: low, medium, high, critical |
| enabled | boolean | yes | Whether the playbook is active |
| requires_approval | boolean | yes | If true, the entire playbook requires approval before execution begins. Individual steps can also gate on HITL independently |
| mitre_techniques | list of strings | no | MITRE ATT&CK technique IDs relevant to the playbook |
| steps | list of step objects | yes | Ordered execution steps |
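
The severity_filter semantics can be sketched in Python as follows. This is a minimal illustration, assuming the four levels are ordered low < medium < high < critical and that the filter is an inclusive minimum; `severity_meets` is a hypothetical helper name, not part of AuroraSOC.

```python
# Assumed ordering of the four documented severity levels, lowest first.
SEVERITY_ORDER = ["low", "medium", "high", "critical"]


def severity_meets(alert_severity: str, severity_filter: str) -> bool:
    """Return True when the alert severity is at or above the filter.

    Hypothetical helper: treats severity_filter as an inclusive minimum.
    Raises ValueError for a severity outside the four documented levels.
    """
    return SEVERITY_ORDER.index(alert_severity) >= SEVERITY_ORDER.index(severity_filter)
```

Under these assumptions, a playbook with severity_filter: high would fire for high and critical alerts and ignore low and medium ones.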

Step structure

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | yes | Unique step name within the playbook (snake_case). Referenced by condition fields in downstream steps |
| action | string | yes | Action verb dispatched to the MCP server (e.g. collect_evidence, analyze_flows, enrich_ioc, isolate_host, rotate_credentials, notify) |
| action_class | string | yes | Guardrail tier. See action classes table below |
| requires_approval | boolean | no | If true, this step renders an inline HITL approval card. Default: false |
| parameters | object | yes | Action-specific parameters. Supports Go-template interpolation with field_path references |
| timeout | integer | yes | Maximum execution time in seconds before the step is marked failed |
| condition | string | no | Execution guard. Step runs only when the condition evaluates true. Common pattern: previous.STEP_NAME.status == 'completed' |
| rollback_action | string | no | Required when action_class is actuate, destructive, or write_external. The action verb to undo this step |
| rollback_params | object | no | Parameters for the rollback action. Must include an action field specifying the undo operation |
| audit | object | no | Audit log entry for the rollback. Must include a description field |

Field path interpolation

field_path parameters use dot notation to traverse the alert, event, and case object tree. Bracket notation provides indexed and filtered access into arrays.

Examples:

  • alert.subject.user_name -- scalar field on the alert
  • alert.affected_assets[0] -- first element of a list
  • alert.iocs[type=ipv4][0] -- filter by attribute, then index
  • previous.STEP_NAME.result.FIELD -- output of an earlier step
  • case.id -- the current case identifier
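
The path syntax above can be illustrated with a small resolver. This is a sketch under stated assumptions, not the AuroraSOC implementation: `resolve_field_path` is a hypothetical name, the context is assumed to be nested dicts and lists, IOC entries are assumed to be objects with a `type` key, and filter values are assumed not to contain dots.

```python
import re

# One path segment: a name followed by zero or more [...] selectors.
_SEGMENT = re.compile(r"^([A-Za-z_][A-Za-z0-9_]*)((?:\[[^\]]*\])*)$")
_SELECTOR = re.compile(r"\[([^\]]*)\]")


def resolve_field_path(path: str, context: dict):
    """Resolve a dot/bracket path against nested dicts and lists.

    Hypothetical sketch. Splits on dots first, so filter values that
    themselves contain a dot are not supported.
    """
    current = context
    for segment in path.split("."):
        match = _SEGMENT.match(segment)
        if match is None:
            raise ValueError(f"malformed segment: {segment!r}")
        name, selectors = match.groups()
        current = current[name]
        for selector in _SELECTOR.findall(selectors):
            if "=" in selector:
                # [key=value]: keep only list items whose attribute matches.
                key, value = selector.split("=", 1)
                current = [item for item in current if str(item.get(key)) == value]
            else:
                # [N]: plain integer index into a list.
                current = current[int(selector)]
    return current


# Illustrative context; the field values are made up for the example.
context = {
    "alert": {
        "subject": {"user_name": "jdoe"},
        "affected_assets": ["web-01", "web-02"],
        "iocs": [
            {"type": "domain", "value": "evil.example"},
            {"type": "ipv4", "value": "203.0.113.7"},
        ],
    },
    "case": {"id": "CASE-1"},
}
first_ipv4 = resolve_field_path("alert.iocs[type=ipv4][0]", context)
```

Note that a filter selector on its own yields a list; the trailing index is what reduces it to a single element.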

Action classes

| Class | Tier | Description | Requires rollback | Example actions |
| --- | --- | --- | --- | --- |
| read | L0 | Read-only data collection. Never blocks on approval | No | collect_evidence, analyze_flows, enrich_ioc, get_asset_details, lookup_user, list_user_logins, extract_ioc, detect_dns_tunneling, analyze_behavior |
| write_internal | L0 | Writes to internal AuroraSOC storage (case timeline, evidence store). Does not affect external systems | No | update_case, collect_evidence, capture_traffic |
| write_external | L1 | Sends data to an external system (Slack, PagerDuty, email). Observable outside AuroraSOC | Yes | notify |
| actuate | L2 | Makes a reversible change to an external system. Requires synchronous HITL approval | Yes | isolate_host, block_ip, disable_account, revoke_sessions, revoke_api_keys, append_waf_rule |
| destructive | L3 | Makes a difficult-to-reverse or irreversible change. Auto-quarantines and requires synchronous HITL approval | Yes | rotate_credentials, terminate_process |
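
For tooling that reasons about playbooks, the table above can be transcribed into a lookup. The following is a minimal sketch; `ACTION_CLASSES`, `needs_rollback`, and `highest_tier` are hypothetical names chosen for illustration.

```python
# Tier number and rollback requirement per action class,
# transcribed from the action classes table.
ACTION_CLASSES = {
    "read": {"tier": 0, "requires_rollback": False},
    "write_internal": {"tier": 0, "requires_rollback": False},
    "write_external": {"tier": 1, "requires_rollback": True},
    "actuate": {"tier": 2, "requires_rollback": True},
    "destructive": {"tier": 3, "requires_rollback": True},
}


def needs_rollback(action_class: str) -> bool:
    """Return True when steps of this class must declare a rollback."""
    return ACTION_CLASSES[action_class]["requires_rollback"]


def highest_tier(steps: list) -> int:
    """Return the highest guardrail tier used by any step in the list."""
    return max(ACTION_CLASSES[step["action_class"]]["tier"] for step in steps)
```

A reviewer could use `highest_tier` to spot at a glance whether a playbook reaches L2 or L3 and therefore involves changes to external systems.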

HITL gate configuration

Two levels of human-in-the-loop gating are available:

  1. Playbook-level: setting requires_approval: true at the top level gates the entire playbook. No step executes until an operator approves the playbook invocation.

  2. Step-level: setting requires_approval: true on an individual step renders an inline approval card in the incident view at /incidents/{id}/approvals. The operator can approve or reject without leaving the incident context.

See the approval workflow runbook for the full operator workflow.
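
The two gating levels can be enumerated for a parsed playbook as shown below. This is an illustrative sketch: `approval_gates` is a hypothetical helper, and the playbook is assumed to be the dict a YAML parser would produce.

```python
def approval_gates(playbook: dict) -> list:
    """List the approval gates an operator would encounter, in order.

    Hypothetical helper: returns "playbook" for the top-level gate and
    "step:<name>" for each step-level gate.
    """
    gates = []
    if playbook.get("requires_approval", False):
        gates.append("playbook")
    for step in playbook.get("steps", []):
        # Step-level requires_approval defaults to false when omitted.
        if step.get("requires_approval", False):
            gates.append(f"step:{step['name']}")
    return gates
```

For a playbook with requires_approval: false at the top level and a single gated step named block_source_ip, this returns only the step-level gate.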

Rollback pattern

Every actuate, destructive, and write_external step must declare a rollback_action and rollback_params so the orchestrator can automatically undo the action if the playbook is aborted or if a HITL gate is rejected.

Actuate step rollback:

- name: isolate_host_nftables
  action: isolate_host
  action_class: actuate
  requires_approval: true
  parameters:
    target_host: "{{ alert.affected_assets[0] }}"
    isolation_mode: nftables_drop_egress
    preserve_management_subnet: true
  rollback_action: isolate_host
  rollback_params:
    action: release
  audit:
    description: "Host isolation rolled back after false-positive alert"
  timeout: 120

Write-external notification rollback:

- name: page_oncall
  action: notify
  action_class: write_external
  parameters:
    channel: pagerduty
    urgency: high
  rollback_action: notify
  rollback_params:
    action: send_correction
    channel: pagerduty
    urgency: low
    template: |
      Alert was a false positive. No action was necessary.
  audit:
    description: "Correction page sent after false-positive alert"
  timeout: 30
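
To make the rollback declarations concrete, the sketch below derives a list of undo invocations from the steps that have already completed. It is not the orchestrator's actual logic: `build_rollback_plan` is a hypothetical name, undoing in reverse execution order is an assumption, and so is passing rollback_params through unchanged.

```python
def build_rollback_plan(completed_steps: list) -> list:
    """Build undo invocations for completed steps.

    Hypothetical sketch. Assumes rollbacks run in reverse order of
    execution and that rollback_params is passed to the action as-is.
    """
    plan = []
    for step in reversed(completed_steps):
        if "rollback_action" not in step:
            # read and write_internal steps declare nothing to undo.
            continue
        plan.append({
            "step": step["name"],
            "action": step["rollback_action"],
            "parameters": dict(step.get("rollback_params", {})),
            "audit": step.get("audit", {}).get("description", ""),
        })
    return plan
```

Given the two example steps above in execution order, the plan would first send the PagerDuty correction and then call isolate_host with action: release.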

Validation

Validate playbooks against the AuroraSOC API before deployment:

POST /api/v1/playbooks/validate
Content-Type: application/x-yaml

<playbook YAML body>
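
The request above could be constructed from Python's standard library roughly as follows. This is a sketch: `build_validate_request` is a hypothetical helper, the base URL is a placeholder, and authentication is omitted because it is not described here.

```python
import urllib.request


def build_validate_request(base_url: str, playbook_yaml: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for playbook validation.

    Hypothetical helper. base_url is the AuroraSOC API root, supplied by
    the caller; no authentication headers are added.
    """
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/playbooks/validate",
        data=playbook_yaml.encode("utf-8"),
        headers={"Content-Type": "application/x-yaml"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` sends it. Note that `urlopen` raises `urllib.error.HTTPError` for non-2xx responses, so a failed validation surfaces as an exception whose body can be read from the error object.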

The endpoint returns 200 with a validation result or 422 with a list of schema errors. Common validation errors include:

  • Missing rollback_action on an actuate, destructive, or write_external step
  • Unknown action_class value
  • Circular condition references between steps
  • Missing required fields (name, action, action_class, parameters, timeout)
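
The common errors listed above can also be caught locally before calling the API. The following is a minimal pre-flight check, not a replacement for the server-side validator: `lint_playbook` and its message strings are hypothetical, and condition references are detected by assuming they always take the form previous.STEP_NAME.

```python
import re

REQUIRED_STEP_FIELDS = ("name", "action", "action_class", "parameters", "timeout")
KNOWN_CLASSES = {"read", "write_internal", "write_external", "actuate", "destructive"}
ROLLBACK_CLASSES = {"write_external", "actuate", "destructive"}
# Assumption: conditions refer to other steps as previous.STEP_NAME.<something>.
_PREVIOUS_REF = re.compile(r"previous\.([A-Za-z0-9_]+)\.")


def lint_playbook(playbook: dict) -> list:
    """Return a list of error strings for a parsed playbook (hypothetical)."""
    errors = []
    deps = {}
    for index, step in enumerate(playbook.get("steps", [])):
        label = step.get("name", f"steps[{index}]")
        for field in REQUIRED_STEP_FIELDS:
            if field not in step:
                errors.append(f"{label}: missing required field {field}")
        action_class = step.get("action_class")
        if action_class is not None and action_class not in KNOWN_CLASSES:
            errors.append(f"{label}: unknown action_class {action_class}")
        if action_class in ROLLBACK_CLASSES and "rollback_action" not in step:
            errors.append(f"{label}: missing rollback_action")
        # Record which steps this step's condition refers to.
        deps[label] = set(_PREVIOUS_REF.findall(step.get("condition") or ""))

    def reaches(start, node, seen):
        # Depth-first search: can we get back to start via condition references?
        for nxt in deps.get(node, ()):
            if nxt == start:
                return True
            if nxt not in seen:
                seen.add(nxt)
                if reaches(start, nxt, seen):
                    return True
        return False

    for label in deps:
        if reaches(label, label, set()):
            errors.append(f"{label}: circular condition reference")
    return errors
```

An empty list means none of these four checks fired; the API may still reject the playbook for reasons this sketch does not cover.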

Minimal complete example

name: example-brute-force-response
description: >
  Multiple failed SSH login attempts detected. Block the source
  IP at the network edge and notify the on-call analyst.
severity_filter: high
enabled: true
requires_approval: false
mitre_techniques:
  - T1110.001

steps:
  - name: enrich_source_ip
    action: enrich_ioc
    action_class: read
    parameters:
      sources:
        - threatfox
        - alienvault_otx
      ioc_field_path: alert.iocs[type=ipv4][0]
    timeout: 30

  - name: block_source_ip
    action: block_ip
    action_class: actuate
    requires_approval: true
    parameters:
      ip_field_path: alert.iocs[type=ipv4][0]
      ttl_seconds: 3600
      reason: "SSH brute force detected"
    rollback_action: block_ip
    rollback_params:
      action: unblock
    audit:
      description: "IP block rolled back after false-positive brute force alert"
    condition: "previous.enrich_source_ip.status == 'completed'"
    timeout: 30

  - name: notify_oncall
    action: notify
    action_class: write_external
    parameters:
      channel: slack
      slack_channel: "#sec-oncall"
      template: |
        SSH brute force blocked: {{ alert.iocs[type=ipv4][0] }}
        Case: {{ case.id }}
    rollback_action: notify
    rollback_params:
      action: send_correction
      channel: slack
      slack_channel: "#sec-oncall"
      template: |
        SSH brute force alert retracted. False positive.
        Case: {{ case.id }}
    audit:
      description: "Correction sent to #sec-oncall after false-positive alert"
    condition: "previous.block_source_ip.status == 'completed'"
    timeout: 30
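
The condition guards in this example chain the steps together: each step runs only if its predecessor completed. A sketch of how such a guard might be evaluated is shown below. The full condition grammar is not specified here, so this handles only the common previous.STEP_NAME.status == 'VALUE' pattern; `should_run` is a hypothetical name.

```python
import re

# Matches only the documented common pattern:
#   previous.STEP_NAME.status == 'VALUE'
_GUARD = re.compile(r"^previous\.(\w+)\.status\s*==\s*'([^']*)'$")


def should_run(step: dict, statuses: dict) -> bool:
    """Decide whether a step's condition guard allows it to run.

    Hypothetical sketch. statuses maps step names to their final status
    string; any other condition syntax raises ValueError.
    """
    condition = step.get("condition")
    if condition is None:
        return True  # no guard: the step always runs
    match = _GUARD.match(condition.strip())
    if match is None:
        raise ValueError(f"unsupported condition: {condition!r}")
    step_name, expected = match.groups()
    return statuses.get(step_name) == expected
```

With this logic, if the operator rejects the approval for block_source_ip and that step never reaches completed, notify_oncall is skipped as well.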

Best practices

  • Always include rollback_action on actuate, destructive, and write_external steps
  • Use condition guards to sequence steps that depend on previous step output
  • Separate playbook stages with # comment dividers for readability
  • Use {{ }} template interpolation instead of hardcoding values
  • Place HITL gates on actuate and destructive steps, not on read or write_internal
  • Include MITRE technique IDs for traceability in the SOC audit trail
  • Validate playbooks via the API before enabling in production
  • Use snake_case for step names, kebab-case for playbook names
  • Set conservative timeouts (30-60 s for reads, 90-120 s for evidence collection)
  • Reference previous.STEP_NAME.result.FIELD for data flow between steps