
OpenClaw Multi-Model Configuration: GPT, Claude, and Gemini Fallbacks

Running OpenClaw on a single model is like owning one screwdriver. It works until it doesn’t, and then your agent sits idle while you scramble for an API key. A properly configured fallback chain keeps your agent running through rate limits, outages, and provider hiccups, and it can cut your monthly API bill by 60% or more if you route tasks to the right model tier.

This guide covers the full multi-model setup: provider configuration, fallback chains, per-task routing, cost optimization, and the known bugs that will bite you if you don’t plan for them. If you haven’t installed OpenClaw yet, start with our setup guide and come back here once your agent is running on a single provider.


Supported Providers and Models

OpenClaw supports direct API connections to every major model provider, plus aggregators like OpenRouter for access to dozens more through a single key.

Tier 1 providers (direct API support):

  • Anthropic: Claude Opus 4.6, Claude Sonnet 4.6
  • OpenAI: GPT-5.4
  • Google: Gemini 3.1 Pro Preview, Gemini 2.0 Flash
  • DeepSeek: DeepSeek R1, DeepSeek V3.2

Aggregators and local options:

  • OpenRouter: Access to 100+ models through one API key
  • Ollama: Run open-weight models locally (Llama 4 Scout, Mistral, etc.)
  • LM Studio: Local model hosting with OpenAI-compatible API

Each provider needs its own API key stored as an environment variable. Set them before configuring models:

export ANTHROPIC_API_KEY="sk-ant-..."
export OPENAI_API_KEY="sk-..."
export GEMINI_API_KEY="AIza..."
export OPENROUTER_API_KEY="sk-or-..."

Or add them to your .env file in the OpenClaw directory. Never hardcode keys in openclaw.json5.


Setting Up Your Primary Model

The primary model handles every request unless it fails. Pick the strongest model you can afford for your main use case.

Set it via CLI:

openclaw models set anthropic/claude-sonnet-4-6

Or edit openclaw.json5 directly:

{
  agents: {
    defaults: {
      model: {
        primary: "anthropic/claude-sonnet-4-6"
      }
    }
  }
}

Verify it took effect:

openclaw models status

This shows the active model, authentication status, and any OAuth token expiry dates. If the model doesn’t change, restart the gateway with openclaw gateway restart and start a fresh session with /new.

We’ve found that Claude Sonnet 4.6 hits the best balance of capability and cost for most deployments. Opus 4.6 is stronger for complex reasoning but costs roughly 5x more per token. GPT-5.4 is a solid alternative if your workflow leans heavily on function calling or structured outputs.


Configuring the Fallback Chain

Fallbacks activate when the primary model fails. OpenClaw tries each fallback in order until one responds.

CLI Method

openclaw models fallbacks add openai/gpt-5.4
openclaw models fallbacks add google/gemini-3.1-pro
openclaw models fallbacks add openrouter/deepseek/deepseek-r1

Check the chain:

openclaw models fallbacks list

Config File Method

For precise control, edit openclaw.json5:

{
  agents: {
    defaults: {
      model: {
        primary: "anthropic/claude-sonnet-4-6",
        fallbacks: [
          "openai/gpt-5.4",
          "google/gemini-3.1-pro",
          "openrouter/deepseek/deepseek-r1"
        ]
      }
    }
  }
}

What Triggers a Fallback

OpenClaw falls back to the next model when:

  • Rate limits: The provider returns HTTP 429
  • Auth failures: Invalid or expired API key or OAuth token
  • Timeouts: No response within the configured timeout window
  • Server errors: Provider returns 500/502/503

Before falling back to a different model, OpenClaw first rotates through all authentication profiles for the current provider. If you have three API keys for Anthropic, it tries all three before moving to GPT-5.4.

The cooldown progression is aggressive: 1 minute after the first failure, then 5 minutes, 25 minutes, and caps at 1 hour. Billing failures get even longer cooldowns, starting at 5 hours. This means a rate-limited provider won’t be retried immediately, giving it time to recover.


Model Strengths by Task Type

Not every model excels at the same things. After deploying OpenClaw across dozens of client environments, here is where each model consistently performs best:

| Task Type | Recommended Model | Why |
| --- | --- | --- |
| Complex reasoning and coding | Claude Opus 4.6 | Strongest at multi-step logic, large codebases |
| General-purpose agent work | Claude Sonnet 4.6 | Best cost-to-capability ratio for daily tasks |
| Function calling and structured output | GPT-5.4 | Most reliable JSON schema adherence |
| Multimodal analysis (images, PDFs) | Gemini 3.1 Pro Preview | Largest context window, strong vision |
| High-volume simple tasks | Gemini 2.0 Flash | Fast, cheap, good enough for lookups |
| Heartbeats and health checks | DeepSeek V3.2 or Gemini Flash-Lite | Costs under $0.50/M tokens |
| Chinese language content | Doubao Pro (Volcano Engine) | Native Chinese language model |
| Privacy-sensitive deployments | Ollama with Llama 4 Scout | Runs entirely on your hardware |

The mistake most teams make: running Opus or GPT-5.4 for everything, including heartbeats that fire every 30 minutes. That is roughly $0.03 per heartbeat on Opus versus $0.0002 on Flash-Lite. Over a month of continuous operation, that is the difference between $43 and $0.29 for heartbeats alone.
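The heartbeat math above is easy to verify: at one heartbeat every 30 minutes, a 30-day month fires 1,440 of them. A quick sketch using the per-heartbeat costs quoted above:

```python
# Per-heartbeat costs quoted in the text above (approximate)
COST_PER_HEARTBEAT = {"opus": 0.03, "flash_lite": 0.0002}

# One heartbeat every 30 minutes -> 48/day -> 1,440 over a 30-day month
heartbeats_per_month = (24 * 60 // 30) * 30

monthly = {model: heartbeats_per_month * cost
           for model, cost in COST_PER_HEARTBEAT.items()}
# Opus lands around $43; Flash-Lite around $0.29
```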


Cost-Optimized Routing

The real savings come from routing different task types to different model tiers. OpenClaw supports this through per-agent model overrides and dedicated heartbeat model configuration.

Dedicated Heartbeat Model

Heartbeats run every 30 minutes by default and consume tokens on simple status checks. Assign a budget model:

{
  agents: {
    defaults: {
      model: {
        primary: "anthropic/claude-sonnet-4-6",
        fallbacks: ["openai/gpt-5.4", "google/gemini-3.1-pro"]
      },
      heartbeatModel: {
        primary: "google/gemini-2.0-flash",
        fallbacks: ["openrouter/deepseek/deepseek-v3.2"]
      }
    }
  }
}

For more on configuring the heartbeat itself, see our heartbeat scheduling guide.

Monthly Cost Projections

Assuming a moderately active agent (roughly 2M tokens/day input, 500K tokens/day output):

| Strategy | Monthly Input Cost | Monthly Output Cost | Total |
| --- | --- | --- | --- |
| All Claude Opus 4.6 | ~$900 | ~$450 | ~$1,350 |
| All Claude Sonnet 4.6 | ~$180 | ~$90 | ~$270 |
| Sonnet primary + Flash heartbeats | ~$150 | ~$75 | ~$225 |
| Tiered (Sonnet/Flash/Flash-Lite) | ~$100 | ~$50 | ~$150 |

The tiered approach saves roughly 89% compared to running Opus for everything. Even switching from Opus to Sonnet as your primary saves over $1,000 per month at this usage level. For detailed cost breakdowns, see our API costs guide.
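To reproduce the single-model rows of the projection table, the per-million-token rates below are back-derived from those figures; they are assumptions for illustration, not official price sheets:

```python
# Assumed (input $/M tokens, output $/M tokens) rates, back-derived from
# the projection table above; treat them as illustrative only.
RATES = {
    "opus-4.6":   (15.0, 30.0),
    "sonnet-4.6": (3.0, 6.0),
}

DAYS = 30
input_m = 2.0 * DAYS    # 2M input tokens/day  -> 60M tokens/month
output_m = 0.5 * DAYS   # 500K output tokens/day -> 15M tokens/month

def monthly_cost(model: str) -> float:
    rate_in, rate_out = RATES[model]
    return input_m * rate_in + output_m * rate_out

# opus-4.6   -> $900 input + $450 output = $1,350
# sonnet-4.6 -> $180 input + $90 output  = $270
```

At these assumed rates the 5x Opus premium mentioned earlier falls straight out of the arithmetic.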


Known Fallback Bugs and Workarounds

Three bugs in the current fallback system trip up most deployments. Knowing about them saves you hours of debugging.

Bug 1: Fallback Infinite Loop (GitHub #57760)

When the primary model hits a rate limit, the system sometimes enters a loop: it identifies the next fallback candidate, but then the LiveSessionModelSwitchError mechanism rejects the switch and reverts to the failed primary. The cycle repeats indefinitely.

Workaround: Add all your fallback models to the allowlist explicitly. This ensures the session model switch is permitted:

{
  agents: {
    defaults: {
      model: {
        primary: "anthropic/claude-sonnet-4-6",
        fallbacks: ["openai/gpt-5.4", "google/gemini-3.1-pro"]
      },
      models: {
        "anthropic/claude-sonnet-4-6": { alias: "Sonnet" },
        "openai/gpt-5.4": { alias: "GPT" },
        "google/gemini-3.1-pro": { alias: "Gemini" }
      }
    }
  }
}

Bug 2: Fallback Overwrites Primary (GitHub #47705)

When a fallback model handles a request successfully, it sometimes gets written back to openclaw.json5 as the new primary. Your agent silently switches to the fallback model permanently, and the original primary is never retried.

Workaround: After configuring your models, set the config file to read-only:

chmod 444 ~/.config/openclaw/openclaw.json5

Remove the lock when you need to make intentional changes. This is blunt but effective until the bug is patched.

Bug 3: CLI Allowlist Exclusion (GitHub #20265)

Running openclaw models fallbacks add populates the allowlist with only the fallback models, excluding your primary. This silently breaks sessions and cron jobs that rely on the primary model.

Workaround: Always add your primary model to the allowlist after adding fallbacks:

openclaw models fallbacks add openai/gpt-5.4
openclaw models fallbacks add google/gemini-3.1-pro
# Re-add primary to the allowlist
openclaw models set anthropic/claude-sonnet-4-6

Or configure everything in openclaw.json5 directly, where you control the full allowlist.


Complete Production Configuration

Here is a full openclaw.json5 configuration that implements tiered routing, explicit allowlisting, and all the defensive patterns described above:

{
  agents: {
    defaults: {
      model: {
        // Primary: best cost/capability ratio for general work
        primary: "anthropic/claude-sonnet-4-6",
        // Fallback chain: tried in order when primary fails
        fallbacks: [
          "openai/gpt-5.4",
          "google/gemini-3.1-pro",
          "openrouter/deepseek/deepseek-r1"
        ]
      },
      // Budget model for heartbeats (runs every 30 min)
      heartbeatModel: {
        primary: "google/gemini-2.0-flash",
        fallbacks: ["openrouter/deepseek/deepseek-v3.2"]
      },
      // Explicit allowlist prevents Bug #57760 and #20265
      models: {
        "anthropic/claude-sonnet-4-6": { alias: "Sonnet" },
        "anthropic/claude-opus-4-6": { alias: "Opus" },
        "openai/gpt-5.4": { alias: "GPT" },
        "google/gemini-3.1-pro": { alias: "Gemini" },
        "google/gemini-2.0-flash": { alias: "Flash" },
        "openrouter/deepseek/deepseek-r1": { alias: "R1" },
        "openrouter/deepseek/deepseek-v3.2": { alias: "V3" }
      }
    }
  }
}

After saving this config, restart the gateway and verify:

openclaw gateway restart
openclaw models status
openclaw models fallbacks list

You should see Sonnet as primary, the three fallbacks listed in order, and Flash as the heartbeat model. The /model status command inside a chat session gives a more detailed view including auth profile info and current cooldown states.


Frequently Asked Questions

How do I set up model fallbacks in OpenClaw?

Add fallbacks via CLI with openclaw models fallbacks add provider/model-name for each model you want in the chain. They activate in the order you add them. Alternatively, edit openclaw.json5 and list models in the agents.defaults.model.fallbacks array. Always add your fallback models to the allowlist (the models object) to avoid the known session switch bug.

What happens when my primary model fails?

OpenClaw first rotates through all authentication profiles for that provider (if you have multiple API keys). If all profiles fail, it moves to the first model in your fallback chain. The failed provider enters a cooldown starting at 1 minute, escalating to 5 minutes, 25 minutes, and capping at 1 hour. During cooldown, the fallback model handles all requests.

Can I use different models for different tasks?

Yes. OpenClaw supports dedicated models for heartbeats via heartbeatModel, for image generation via imageGenerationModel, and per-channel routing if you run OpenClaw across multiple messaging platforms. You can also switch models mid-session with the /model command for specific tasks that need a stronger or cheaper model.

How much can multi-model routing save on API costs?

Substantial amounts. A moderately active agent running all requests through Claude Opus 4.6 costs roughly $1,350/month. Switching the primary to Sonnet 4.6 and using Flash for heartbeats drops that to around $225/month. Adding a budget model for sub-agents and simple lookups can push savings past 89%. See our API costs breakdown for detailed pricing.

Why does my fallback model overwrite my primary in the config?

This is a known bug (GitHub issue #47705). When the fallback model successfully handles a request, it can get persisted back to openclaw.json5 as the new primary. The workaround is to set your config file to read-only with chmod 444 after configuration, or to manage all model settings through the config file rather than CLI commands during active sessions.

Does OpenClaw work with Ollama and local models?

Yes. Install Ollama, pull your model (ollama pull llama4-scout), and configure OpenClaw to point at the local endpoint. The critical setting most people miss: you must include api: "openai-responses" in the provider config. Set baseUrl to http://127.0.0.1:11434/v1. Docker users should replace 127.0.0.1 with host.docker.internal. See our Docker deployment guide for container-specific networking.
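Putting those two settings together, a provider block might look like the sketch below. The exact nesting of the provider section is an assumption; the two fields that matter are the `api` and `baseUrl` values named above:

```json5
{
  // Illustrative provider block; confirm the exact key path against
  // your OpenClaw version's config reference.
  providers: {
    ollama: {
      // Use host.docker.internal instead of 127.0.0.1 inside Docker
      baseUrl: "http://127.0.0.1:11434/v1",
      // The setting most people miss: required for local endpoints
      api: "openai-responses"
    }
  }
}
```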

What is the best primary model for OpenClaw right now?

For most use cases in April 2026, Claude Sonnet 4.6 offers the best balance of capability, speed, and cost. Gemini 3.1 Pro Preview scores highest on the Artificial Analysis Intelligence Index (57), but Claude Sonnet handles agent workflows, tool use, and multi-step reasoning more reliably in our deployments. Use Opus 4.6 only when you need peak reasoning quality and can absorb the 5x cost premium.

How do I check which model OpenClaw is currently using?

Run /model status inside any chat session for a detailed view showing the active model, auth profile, and whether a fallback is active. From the CLI, openclaw models status shows the configured primary and fallback chain. Since the v2026.2.21 release, the /status command also indicates whether the system fell back and why.


Key Takeaways

  • Configure at least two fallback models from different providers to survive outages and rate limits
  • Use budget models for heartbeats and simple tasks to cut monthly costs by 60-89%
  • Add every model in your fallback chain to the explicit allowlist to avoid the infinite loop bug
  • Set your config file to read-only after setup to prevent the fallback overwrite bug
  • Claude Sonnet 4.6 as primary with GPT-5.4 and Gemini 3.1 Pro as fallbacks covers the widest range of failure scenarios

Last Updated: Apr 21, 2026
