security | 2026-01-24 | 1 min read

Incident Response for LLM Agents

Runbooks for agent misfires: containment, rollback, evidence capture, and post-incident improvements.

date: 2026-01-24

tags: [security, reliability, operations, governance]

What counts as an “agent incident”?

Unauthorized tool call
Data exfiltration attempt
Incorrect action taken in an external system
Budget runaway (cost spike)

The 4-phase runbook

1) Detect

anomaly alerts: cost spikes, tool error spikes, policy denials
user-reported issues (support channel)
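
As a rough illustration of the anomaly alerts above, the sketch below flags a metric when its latest value sits well above a recent baseline. The metric names (`cost_usd`, `tool_errors`, `policy_denies`), the z-score threshold, and the minimum sample count are assumptions chosen for the example, not values from this article.

```python
from statistics import mean, stdev


def is_spike(history, current, z_threshold=3.0, min_samples=5):
    """Return True when `current` is far above the mean of `history`.

    Assumption: a simple z-score test is good enough for a first alert.
    """
    if len(history) < min_samples:
        return False  # not enough data to judge
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # Flat baseline: any increase counts as a spike.
        return current > mu
    return (current - mu) / sigma > z_threshold


def detect(metrics, baselines):
    """Return the names of metrics whose current value looks anomalous."""
    return [
        name
        for name, value in metrics.items()
        if is_spike(baselines.get(name, []), value)
    ]


# Hypothetical per-run baselines and the latest observation.
baselines = {
    "cost_usd": [1.0, 1.1, 0.9, 1.0, 1.2],
    "tool_errors": [1, 2, 3, 2, 1, 2],
    "policy_denies": [0, 0, 0, 0, 0],
}
latest = {"cost_usd": 5.0, "tool_errors": 2, "policy_denies": 0}
alerts = detect(latest, baselines)
```

In practice the baseline window and threshold would be tuned per workflow. User-reported issues arrive through a separate channel and are not modeled here.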

2) Contain

disable workflow or tool at policy layer
rotate tenant-scoped keys if needed
quarantine run logs and evidence
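
The containment steps above can be sketched as a kill switch that is consulted before every tool call, plus a flag that freezes the evidence for a run. `PolicyLayer`, its method names, and the example workflow and tool names are hypothetical. A real policy layer would persist this state and enforce it outside the agent process.

```python
from dataclasses import dataclass, field


@dataclass
class PolicyLayer:
    """Minimal in-memory stand-in for a policy layer (illustrative only)."""

    disabled_tools: set = field(default_factory=set)
    disabled_workflows: set = field(default_factory=set)
    quarantined_runs: set = field(default_factory=set)

    def disable_tool(self, tool):
        self.disabled_tools.add(tool)

    def disable_workflow(self, workflow):
        self.disabled_workflows.add(workflow)

    def quarantine_run(self, run_id):
        # Quarantined runs should be treated as read-only evidence.
        self.quarantined_runs.add(run_id)

    def allows(self, workflow, tool):
        """Check performed before every tool call."""
        return (
            workflow not in self.disabled_workflows
            and tool not in self.disabled_tools
        )


policy = PolicyLayer()
policy.disable_tool("send_email")   # hypothetical misfiring tool
policy.quarantine_run("run-123")    # hypothetical run id
```

Disabling at the policy layer, rather than editing the prompt, means the block takes effect on the next tool call without a redeploy.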

3) Eradicate

patch policy rules, tool schema, or prompt template
add regression tests for the failing case
verify with eval harness
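
One way to pin the failing case is a small table of regression cases that runs against the patched rule. The rule, the `issue_refund` tool, and the refund limit below are invented for illustration, and a real eval harness would cover prompts and routing as well.

```python
def policy_allows(tool, args, max_refund=100.0):
    """Patched rule (hypothetical): refunds above `max_refund` are denied."""
    if tool == "issue_refund":
        return args.get("amount", 0) <= max_refund
    return True


# Each case: (tool, args, expected decision). The first row stands in for
# the call that caused the incident and must stay denied from now on.
REGRESSION_CASES = [
    ("issue_refund", {"amount": 5000.0}, False),
    ("issue_refund", {"amount": 20.0}, True),
    ("lookup_order", {"order_id": "A1"}, True),
]


def run_regressions(cases=REGRESSION_CASES):
    """Return the cases whose decision differs from the expected one."""
    failures = []
    for tool, args, expected in cases:
        actual = policy_allows(tool, args)
        if actual != expected:
            failures.append((tool, args, expected, actual))
    return failures


failures = run_regressions()
```

An empty `failures` list is the gate for moving on to recovery.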

4) Recover

re-enable under tighter budgets
add monitoring and alerts
communicate to stakeholders
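
Re-enabling under tighter budgets might look like the guard below, which refuses any charge that would push a run past its limit. `BudgetGuard`, `tightened`, and the 50% reduction factor are assumptions made for this sketch.

```python
class BudgetExceeded(Exception):
    """Raised when a charge would push spend past the limit."""


class BudgetGuard:
    """Tracks spend for one run against a hard limit (illustrative)."""

    def __init__(self, limit_usd):
        self.limit_usd = limit_usd
        self.spent_usd = 0.0

    def charge(self, cost_usd):
        if self.spent_usd + cost_usd > self.limit_usd:
            raise BudgetExceeded(
                f"charge of {cost_usd} would exceed limit {self.limit_usd}"
            )
        self.spent_usd += cost_usd


def tightened(previous_limit_usd, factor=0.5):
    """Assumed recovery rule: re-enable at a fraction of the old limit."""
    return previous_limit_usd * factor


# Re-enable a workflow that previously ran with a 10.0 USD limit.
guard = BudgetGuard(tightened(10.0))
guard.charge(3.0)
guard.charge(2.0)  # exactly at the new limit, still allowed
```

The limit can be relaxed again once monitoring shows the workflow behaving as expected.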

Evidence capture (non-negotiable)

run envelope (policy hash, route decision)
tool call ledger
output diff vs expected
human approvals (if any)
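
A minimal sketch of bundling the four evidence items into one record follows. The field names, the SHA-256 hash over canonical JSON, and the unified-diff format are assumptions. The article does not specify how the policy hash or the output diff is computed.

```python
import difflib
import hashlib
import json


def policy_hash(policy):
    """Hash a policy dict via canonical JSON so key order does not matter."""
    canonical = json.dumps(policy, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def build_evidence(policy, route_decision, tool_calls,
                   expected_output, actual_output, approvals=None):
    """Assemble an evidence record for one run (hypothetical schema)."""
    diff = list(difflib.unified_diff(
        expected_output.splitlines(),
        actual_output.splitlines(),
        fromfile="expected",
        tofile="actual",
        lineterm="",
    ))
    return {
        "run_envelope": {
            "policy_hash": policy_hash(policy),
            "route_decision": route_decision,
        },
        "tool_call_ledger": list(tool_calls),
        "output_diff": diff,
        "human_approvals": list(approvals or []),
    }


evidence = build_evidence(
    policy={"max_refund": 100, "tools": ["lookup_order"]},
    route_decision="refund-workflow",  # hypothetical route name
    tool_calls=[{"tool": "issue_refund", "args": {"amount": 5000}}],
    expected_output="Refund denied",
    actual_output="Refund issued",
)
```

Writing the record to append-only storage at capture time keeps it usable for the post-incident review.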
