Openclaw Incident Response: What Happens When Your Agent Goes Wrong

Consider an Openclaw agent running a nightly data sync that deletes 340 records from a staging database because the prompt says “clean up stale entries” without defining what “stale” means. The agent interprets records older than 30 days as stale. The team discovers the damage six hours later when a morning report returns empty. They have no kill switch procedure, no rollback plan, and no way to trace exactly which records were affected.

Scenarios like this are not theoretical. Agents that connect to real systems and execute real actions will eventually take a wrong action, hit an infinite loop, leak data to the wrong channel, or run up a four-figure API bill overnight. The question is not whether it happens but whether you have a response plan ready when it does. This guide is that plan: a step-by-step incident response playbook for Openclaw covering failure detection, emergency shutdown, rollback, post-incident review, and prevention.

The Five Failure Types You Will Actually See

Not every agent failure looks the same, and the response depends on identifying which type you are dealing with. Here are the five categories that account for nearly every Openclaw incident in production.

Wrong Action

The agent does something you did not intend. It sends an email to the wrong recipient, modifies a file it should only have read, or interprets an ambiguous instruction in a way that causes damage. This is the most common failure in deployments where prompts are underspecified. One DEV Community analysis documented an agent that misinterpreted “protect the environment” as a file deletion command.

Wrong-action failures are dangerous because the agent reports success. There is no error in the logs. The damage only surfaces when a human checks the output.

Silent Failure and Agent Hang

The agent stops responding entirely. According to GitHub issue #8288, a failed tool call can cause the agent to hang for up to 600 seconds with no timeout, no recovery, and no fallback. The user sees nothing. The logs may show nothing at the default level.
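One way to keep a failed tool call from stalling an agent for minutes is to wrap each call in a hard deadline. The sketch below is a generic Python pattern, not Openclaw's internal dispatch code; `call_tool_with_timeout` and `slow_tool` are hypothetical names used only for illustration.

```python
import concurrent.futures
import time


def call_tool_with_timeout(tool, args, timeout_s):
    """Run tool(*args) in a worker thread and stop waiting after timeout_s seconds.

    Returns (ok, value): value is the tool's result on success, or an
    error string on timeout or exception.
    """
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(tool, *args)
    try:
        return True, future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        # The worker thread is not killed; the caller simply stops waiting.
        return False, f"tool timed out after {timeout_s}s"
    except Exception as exc:
        return False, f"tool failed: {exc!r}"
    finally:
        # Do not block on a hung worker when cleaning up.
        pool.shutdown(wait=False)


def slow_tool(seconds):
    """Hypothetical stand-in for a tool call that hangs."""
    time.sleep(seconds)
    return "done"


print(call_tool_with_timeout(slow_tool, (0.5,), timeout_s=0.1))
```

Because a thread cannot be force-stopped, this only frees the caller. A tool that must be terminated outright would need to run in a separate process instead.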

The v2026.3.2 release introduced a tool dispatch regression where agents could respond to messages but could not execute any tools, because permissions switched from opt-out to opt-in without warning.

Data Leak

The agent sends sensitive information to an unintended destination. This can happen through misconfigured channel routing, overly broad tool permissions, or compromised skills. Reco’s security analysis found 21,639 publicly exposed Openclaw instances and documented a breach where 35,000 email addresses and 1.5 million agent API tokens were compromised through the Moltbook platform.

The risk multiplies with integrations. An agent connected to Slack, Gmail, and a CRM has three potential leak vectors instead of one.

Infinite Loop and Context Explosion

The agent enters a cycle where it repeatedly calls the same tool, retries a failed action, or accumulates context until the window overflows. Each screenshot in a computer-use session consumes 1,000 to 2,000 tokens. After 8 to 10 steps, the conversation can reach 30,000 tokens and exceed model limits, causing the agent to freeze or produce garbage output.

Infinite loops are expensive even when they do not cause functional damage. A loop running overnight against a pay-per-token API can generate a bill that dwarfs your monthly infrastructure cost.
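A simple guard against this failure is to count identical tool calls inside a sliding window and trip a breaker when the count gets too high. The following is a minimal sketch under assumptions: `LoopDetector`, its window size, and its repeat threshold are hypothetical and are not part of Openclaw.

```python
import json
from collections import Counter, deque


class LoopDetector:
    """Flag an agent that keeps issuing the same tool call."""

    def __init__(self, window=20, max_repeats=5):
        # Only the most recent `window` calls are remembered.
        self.recent = deque(maxlen=window)
        self.max_repeats = max_repeats

    def record(self, tool_name, arguments):
        """Record one call; return True if it now looks like a loop."""
        # Serialize arguments deterministically so equal calls compare equal.
        key = (tool_name, json.dumps(arguments, sort_keys=True, default=str))
        self.recent.append(key)
        return Counter(self.recent)[key] >= self.max_repeats


detector = LoopDetector(window=10, max_repeats=3)
for attempt in range(4):
    if detector.record("fetch_page", {"url": "https://example.com"}):
        print(f"loop suspected on attempt {attempt + 1}; pause the agent")
        break
```

The threshold is a judgment call: legitimate polling workflows repeat calls too, so a tripped detector is better treated as a reason to pause and ask a human than as proof of failure.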

Cost Spike

A subtler failure mode. The agent works correctly but consumes far more resources than expected. Long-context requests to Anthropic models require extra usage eligibility, and an agent that inadvertently sends full documents as context on every call can multiply your API spend by 10x in a single afternoon.

Cost spikes often go undetected until the invoice arrives because there is no built-in spending alert in Openclaw.

Emergency Response: Kill Switches

When you detect an incident, the first priority is stopping the agent from causing further damage. Speed matters more than diagnosis at this stage.

Immediate Shutdown (Severity: Critical)

For an agent that is actively causing harm, execute these steps in order:

Stop the agent process. Run openclaw stop or kill the process directly. If the agent runs as a systemd service: systemctl stop openclaw.
Disconnect channels. Remove the agent from Slack, Discord, Telegram, or whichever channel it operates on. This prevents queued messages from triggering new actions while you investigate.
Revoke API keys. If the agent has credentials for external services (AWS, Stripe, a CRM), rotate those keys immediately. Do not wait to determine whether the credentials were misused.
Disable cron jobs. Run openclaw cron disable --all to prevent scheduled tasks from firing during the incident.
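The command-line parts of this sequence can be scripted ahead of time so nobody has to remember them under pressure. The sketch below wraps only the commands named above; `run_kill_switch` is a hypothetical helper, the `systemctl` step applies only if the agent runs as a systemd service, and disconnecting channels and rotating keys remain manual steps in each provider's console.

```python
import subprocess

# Commands taken from the steps above, in order.
KILL_SWITCH_COMMANDS = [
    ["openclaw", "stop"],
    ["systemctl", "stop", "openclaw"],  # only relevant for systemd deployments
    ["openclaw", "cron", "disable", "--all"],
]


def run_kill_switch(run=subprocess.run):
    """Run every command even if an earlier one fails; return (command, outcome) pairs."""
    results = []
    for cmd in KILL_SWITCH_COMMANDS:
        try:
            completed = run(cmd, capture_output=True, text=True, timeout=30)
            results.append((" ".join(cmd), completed.returncode))
        except (OSError, subprocess.TimeoutExpired) as exc:
            # A missing binary or a hung command must not block the next step.
            results.append((" ".join(cmd), f"error: {exc}"))
    return results


if __name__ == "__main__":
    for command, outcome in run_kill_switch():
        print(f"{command}: {outcome}")
```

The `run` parameter exists so the script can be rehearsed with a fake runner instead of stopping a live agent.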

Partial Shutdown (Severity: High)

If the agent is misbehaving but not causing active damage, you can restrict it instead of killing it:

Disable specific tools. Edit your configuration to remove the offending tool from plugins.allow. This stops the agent from executing that tool while keeping other functions running.
Enable approval mode. Set agents.defaults.approval: always so every action requires human confirmation before execution. The agent continues to reason but cannot act autonomously.
Switch to a smaller model. Route the agent to a cheaper, less capable model temporarily to reduce both risk and cost while you investigate.

Cost Emergency

If you discover a cost spike in progress:

Set provider spending limits. Most API providers (OpenAI, Anthropic, Google) allow hard spending caps. Set one immediately.
Cap request size and duration. Reduce the max token limit in your provider configuration so each request costs less, and lower LLM_REQUEST_TIMEOUT to 30 seconds so stuck requests fail fast.
Pause non-critical agents. If you run multiple agents, shut down everything except mission-critical workflows.

Rollback Procedures

After stopping the agent, you need to undo whatever damage it caused. The approach depends on what the agent touched.

Configuration Rollback

If the agent modified its own configuration or broke its own setup:

Restore from your most recent configuration backup. If you followed the setup in our Openclaw backup and restore guide, you have timestamped snapshots.
Run openclaw doctor to verify the restored configuration is valid.
Restart the agent and confirm it responds correctly with a safe test message.

Data Rollback

If the agent modified external data (database records, files, spreadsheets):

Identify the blast radius. Check the agent’s memory files and its debug logs (written when OPENCLAW_LOG_LEVEL=DEBUG is set) to trace exactly which records or files were touched.
Restore from your data source’s own backup system. Depending on the system, that is a database snapshot, the Git history, or the version history in Google Sheets.
If no backup exists, reconstruct from the agent’s action log. Every tool call is logged when debug logging is enabled, so this option exists only if debug logging was already on when the incident happened.
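Tracing the blast radius amounts to filtering logged tool calls by time window and by whether the tool writes. The sketch below assumes the log entries have already been parsed into dictionaries with `timestamp`, `tool`, and `target` keys; that shape and the tool names are hypothetical, so adapt the parsing to whatever your debug logs actually contain.

```python
from collections import defaultdict
from datetime import datetime


def blast_radius(entries, start, end, write_tools):
    """Group the targets touched by write-capable tools during the incident window."""
    touched = defaultdict(set)
    for entry in entries:
        when = datetime.fromisoformat(entry["timestamp"])
        if start <= when <= end and entry["tool"] in write_tools:
            touched[entry["tool"]].add(entry["target"])
    # Sort for a stable, reviewable report.
    return {tool: sorted(targets) for tool, targets in touched.items()}


# Hypothetical parsed log entries.
sample = [
    {"timestamp": "2026-01-10T02:00:05", "tool": "db_delete", "target": "orders/17"},
    {"timestamp": "2026-01-10T02:00:06", "tool": "db_read", "target": "orders/18"},
    {"timestamp": "2026-01-10T02:00:07", "tool": "db_delete", "target": "orders/19"},
]
report = blast_radius(
    sample,
    start=datetime(2026, 1, 10, 2, 0),
    end=datetime(2026, 1, 10, 3, 0),
    write_tools={"db_delete"},
)
print(report)
```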

Communication Rollback

If the agent sent messages to the wrong people:

Delete or retract the messages through the channel’s API where possible. Slack supports message deletion; email does not.
Send a correction to affected recipients explaining the error.
Document what was sent, to whom, and what data was exposed for your incident review.

The critical lesson here: rollback is only possible if you have backups. An agent operating on live data with no snapshot strategy turns every wrong action into permanent damage.

Post-Incident Review Template

After you have contained the incident and rolled back the damage, run a structured review. This is not about blame. It is about understanding the failure chain so you can prevent the next incident. This template is adapted from standard DevOps blameless postmortem practice because the principles transfer directly to AI agent operations.

Use this template:

Incident title: [Brief description]
Date/time: [When detected, when resolved]
Severity: Critical / High / Medium / Low
Failure type: Wrong action / Silent failure / Data leak / Infinite loop / Cost spike

Timeline:

[HH:MM] Agent began misbehaving (estimated)
[HH:MM] Incident detected by [person/alert]
[HH:MM] Kill switch activated
[HH:MM] Root cause identified
[HH:MM] Rollback completed
[HH:MM] Agent restored to service

Root cause: [What specifically caused the failure? Be precise: “The prompt said X, the agent interpreted it as Y because Z.”]

Blast radius: [What systems, data, or people were affected?]

Detection gap: [How long between the failure starting and someone noticing? Why?]

What went well:

[Things that worked during the response]

What needs to change:

[Specific improvements with owners and deadlines]

Prevention measures:

[New confirmation rules, allowlist changes, monitoring additions]

File each review in a shared location your team can reference. Over time, these reviews become your most valuable resource for understanding how your agents fail and how to prevent recurrence.
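The detection gap in the template is plain timestamp arithmetic, and recording the timeline in a structured form makes gaps comparable across incidents. This is a minimal sketch; `IncidentTimeline` and the example times are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class IncidentTimeline:
    began: datetime      # estimated start of misbehavior
    detected: datetime   # first human or alert noticed
    contained: datetime  # kill switch activated
    restored: datetime   # agent back in service

    @property
    def detection_gap(self) -> timedelta:
        return self.detected - self.began

    @property
    def time_to_contain(self) -> timedelta:
        return self.contained - self.detected

    @property
    def total_duration(self) -> timedelta:
        return self.restored - self.began


timeline = IncidentTimeline(
    began=datetime(2026, 1, 10, 2, 0),
    detected=datetime(2026, 1, 10, 8, 0),
    contained=datetime(2026, 1, 10, 8, 10),
    restored=datetime(2026, 1, 10, 11, 0),
)
print(timeline.detection_gap, timeline.time_to_contain, timeline.total_duration)
```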

Prevention: Confirmation Rules, Allowlists, and Guardrails

The best incident is one that never happens. Openclaw provides several mechanisms for preventing agent failures before they occur.

Confirmation Rules

Set agents.defaults.approval to require human confirmation for high-risk actions. You can configure this granularly:

Always confirm for actions that modify external data (database writes, file deletions, sending emails)
Auto-approve for read-only actions (data lookups, report generation, status checks)
Confirm above threshold for actions involving spending (API calls above a certain token count)

The single highest-value prevention measure is requiring confirmation on any action that is not reversible. Reads are safe. Writes need a human in the loop until you have months of reliable operation proving otherwise.
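The three rules above reduce to a small decision function. The sketch below illustrates the policy logic only; it is not Openclaw's configuration schema, and `approval_decision`, the action kinds, and the 50,000-token threshold are assumptions chosen for the example.

```python
from enum import Enum


class Decision(Enum):
    AUTO_APPROVE = "auto_approve"
    CONFIRM = "confirm"


def approval_decision(action_kind, estimated_tokens=0, token_threshold=50_000):
    """Decide whether an action needs human confirmation.

    action_kind is "read" or "write"; anything else is treated as risky.
    """
    if action_kind == "write":
        return Decision.CONFIRM  # writes modify external state
    if estimated_tokens > token_threshold:
        return Decision.CONFIRM  # expensive calls need sign-off even for reads
    if action_kind == "read":
        return Decision.AUTO_APPROVE
    return Decision.CONFIRM  # unknown kinds fail closed


print(approval_decision("read"))
print(approval_decision("write"))
print(approval_decision("read", estimated_tokens=80_000))
```

Failing closed on unknown action kinds matters more than the exact threshold: a newly added tool should require confirmation until someone classifies it.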

Tool Allowlists

Restrict which tools the agent can use by configuring plugins.allow as an explicit allowlist rather than relying on the default (which, since v2026.3.2, is opt-in anyway). Only grant tools the agent actually needs for its specific workflow. An agent that sends daily reports does not need file system access.

Review your allowlist quarterly. Remove tools the agent has not used in the past 30 days.
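The quarterly review can be partly automated by comparing the allowlist against last-use timestamps. This sketch assumes you can derive a mapping from tool name to last-use time from your own logs; `stale_tools` and the tool names are hypothetical.

```python
from datetime import datetime, timedelta


def stale_tools(allowlist, last_used, now, max_idle_days=30):
    """Return allowlisted tools that were never used or not used within max_idle_days."""
    cutoff = now - timedelta(days=max_idle_days)
    return sorted(
        tool
        for tool in allowlist
        if last_used.get(tool) is None or last_used[tool] < cutoff
    )


now = datetime(2026, 3, 1)
last_used = {
    "web_search": datetime(2026, 2, 20),
    "file_write": datetime(2025, 12, 1),
}
print(stale_tools(["web_search", "file_write", "shell"], last_used, now))
```

Tools with no usage record at all are reported as stale, on the assumption that an allowlist entry nobody can account for is a removal candidate.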

Skill Vetting

The ClawHub marketplace contains over 2,800 skills, and 12% of them have been confirmed malicious. Before installing any community skill:

Read the skill’s source code. It is open source for a reason.
Check the skill author’s other contributions and reputation.
Test in an isolated environment before deploying to production.
Pin the skill version and review changelogs before updating.
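A version number alone does not prove the code is unchanged since you reviewed it. One possible complement, which is an added suggestion rather than a ClawHub feature, is to record a content hash of the skill's files at review time and compare it before each deployment. In the sketch below, `tree_digest` and `matches_pin` are hypothetical helpers that work on a mapping of relative path to file bytes.

```python
import hashlib


def tree_digest(files):
    """SHA-256 over a mapping of relative path -> file bytes, in sorted path order."""
    digest = hashlib.sha256()
    for path in sorted(files):
        digest.update(path.encode("utf-8"))
        digest.update(b"\0")  # separator between the path and its content hash
        digest.update(hashlib.sha256(files[path]).digest())
    return digest.hexdigest()


def matches_pin(files, pinned_digest):
    """True if the skill's files are byte-identical to the reviewed copy."""
    return tree_digest(files) == pinned_digest


reviewed = {"skill.py": b"print('hello')\n", "manifest.json": b"{}\n"}
pin = tree_digest(reviewed)

tampered = dict(reviewed, **{"skill.py": b"print('hello')\nimport os\n"})
print(matches_pin(reviewed, pin), matches_pin(tampered, pin))
```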

Spending Caps

Set spending limits at the API provider level and monitor usage daily. A reasonable approach:

Set a hard monthly cap at 2x your expected spend
Set a daily alert at 1.5x your average daily spend
Review any day where spend exceeds the daily alert within 24 hours
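These thresholds are simple multiples, so they are easy to recompute whenever your baseline shifts. A minimal sketch, assuming you can export daily spend figures from your provider's dashboard or billing API; the function names and sample numbers are hypothetical.

```python
def spend_thresholds(expected_monthly_spend, recent_daily_spend):
    """Hard monthly cap at 2x expected spend; daily alert at 1.5x the recent daily average."""
    if not recent_daily_spend:
        raise ValueError("need at least one day of spend history")
    average_daily = sum(recent_daily_spend) / len(recent_daily_spend)
    return {
        "monthly_hard_cap": 2 * expected_monthly_spend,
        "daily_alert": 1.5 * average_daily,
    }


def days_needing_review(daily_spend_by_date, daily_alert):
    """Dates whose spend exceeded the alert level, oldest first."""
    return sorted(day for day, spend in daily_spend_by_date.items() if spend > daily_alert)


thresholds = spend_thresholds(300, [10, 10, 10, 10])
history = {"2026-01-01": 9.0, "2026-01-02": 42.0, "2026-01-03": 15.0}
print(thresholds)
print(days_needing_review(history, thresholds["daily_alert"]))
```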

Log Everything

Enable debug logging (OPENCLAW_LOG_LEVEL=DEBUG) for production agents. The storage cost is negligible compared to the diagnostic value. When an incident occurs, the difference between having logs and not having them is the difference between a 30-minute resolution and a 6-hour guessing game.

Frequently Asked Questions

What are the most common Openclaw agent failures?

Silent failures and agent hangs are the most frequent. The agent stops responding after a tool call fails, and without debug logging enabled, there is no indication of what went wrong. Wrong-action failures are the most damaging because the agent reports success while doing something unintended.

How do I immediately stop a runaway Openclaw agent?

Run openclaw stop to halt the process, then disconnect it from all channels (Slack, Discord, Telegram). If the agent runs as a systemd service, use systemctl stop openclaw. Revoke any external API keys the agent holds before investigating further.

How do I roll back changes my agent made?

It depends on what was changed. For configuration damage, restore from a timestamped backup and validate with openclaw doctor. For data changes, use your data source’s native backup or version history. For sent messages, delete them through the channel API and send corrections. Rollback requires that backups exist before the incident.

What should I include in a post-incident review?

At minimum: a timeline (when the failure started, when it was detected, when it was resolved), the root cause stated precisely, the blast radius (what was affected), the detection gap (how long it went unnoticed), and specific prevention measures with owners and deadlines. Skip blame. Focus on the system.

How do I prevent my agent from taking unauthorized actions?

Use Openclaw’s approval mode (agents.defaults.approval: always) for any action that modifies data, sends communications, or spends money. Combine this with a tool allowlist (plugins.allow) that only includes tools the agent needs for its specific workflow. Review the allowlist quarterly.

How do I detect an API cost spike before it gets expensive?

Set a hard monthly spending cap at your API provider and configure a daily spend alert at 1.5x your average. Openclaw does not have built-in spend monitoring, so you must use your provider’s dashboard or billing API. Check alerts within 24 hours, every time.

What do I do if my Openclaw agent leaks sensitive data?

Activate the kill switch immediately. Revoke all API keys and OAuth tokens the agent holds. Identify what data was exposed, to whom, and through which channel. If personal data was leaked, you may have legal notification obligations depending on your jurisdiction. Document everything for your incident review.

How do I set up confirmation rules to prevent mistakes?

Edit your Openclaw configuration to set agents.defaults.approval per action type. The safest starting point is confirming all write operations while auto-approving reads. As you build confidence in specific workflows, you can selectively relax confirmation requirements for well-tested, low-risk actions.

Key Takeaways

Agent failures are not bugs to debug. They are operational incidents that need detection, containment, rollback, and review.
The kill switch sequence is: stop the process, disconnect channels, revoke credentials, disable cron. Memorize it or print it.
Rollback only works if you have backups. An agent on live data with no snapshot strategy means every wrong action is permanent.
Post-incident reviews are the highest-leverage activity. Each review makes the next incident less likely and less damaging.
Prevention starts with requiring human confirmation on any irreversible action and restricting tools to only what each agent needs.

If you are running Openclaw agents in production and want help building an incident response framework tailored to your deployment, SFAI Labs builds and manages AI agent infrastructure for teams that need their agents to be reliable, not just functional.