Why Your AI Project Budget Should Have a Model-Deprecation Reserve

Most production AI systems in 2026 will be forced through one or more model migrations during their lifetime. Vendor sunset announcements, capability cliff transitions, security advisories, and license changes routinely deprecate the underlying model an AI system was built against, and each forced migration costs 2 to 12 engineering weeks of unplanned work. Budgets that do not reserve for deprecation absorb the cost as scope creep, change orders, or silent operational degradation. The fix is paperwork: a 5 to 15 percent reserve in the AI project budget, sized against the project’s exposure profile, with a named migration owner and a documented playbook. This piece names what triggers deprecation, decomposes the migration cost line, and prescribes how to size the reserve.

This piece is a spoke under the AI project economics manifesto, which argues that AI economics has shifted from feature cost to evaluation cost, and that lifecycle costs like model deprecation are core to the new cost surface rather than edge cases.

The deprecation reality in 2026

Frontier model providers (Anthropic, OpenAI, Google DeepMind, Mistral, Meta) operate on aggressive model release cadences. New capability tiers ship every 6 to 12 months. Old capability tiers get deprecated 12 to 24 months after the successor stabilizes. The deprecation timeline is faster than the lifecycle of most enterprise software systems, and meaningfully faster than the budget cycles those systems are funded against.

OpenAI has retired multiple GPT generations across 2024 and 2025. Anthropic has cycled through Claude generations on a similar cadence with documented sunset dates. Google has deprecated Gemini variants. Across the field, the pattern is consistent: a model that was state-of-the-art 18 months ago is no longer available in the API by month 24, and even when available, is no longer cost-competitive against the current generation.

Three properties make AI model deprecation different from traditional software dependency upgrades. First, the deprecation is non-optional; vendors set sunset dates and the API stops working. Second, the migration is not a drop-in replacement; prompts that worked on the old model often need re-engineering for the new model. Third, the eval suite has to be re-run from scratch; the new model’s behavior distribution is different, and previously-passing test cases will shift.

A budget that does not reserve for these forced migrations is a budget that absorbs them as scope creep. The reserve is not optional, in the same way that the deprecation is not optional.

Four triggers of forced migration

The reserve has to cover four distinct triggers, each with different lead time and different remediation cost.

Trigger 1: Vendor sunset announcement. The vendor announces a model will be retired on a specific date. Lead time is typically 6 to 18 months. Action required: re-eval the system on the successor model, adjust prompts, re-lock thresholds, and ship the migration before the sunset date. This is the most common deprecation pattern in 2026 and the easiest to plan for, because the lead time is generous and the migration target is named.

Trigger 2: Capability cliff. A new model generation makes a meaningful capability available (long context, better reasoning, multimodal, lower latency, lower cost) that the buyer’s product roadmap depends on. The old model still works, but the system cannot ship the next feature without the upgrade. Lead time is project-driven, not vendor-driven. Action required: same re-eval and migration as trigger 1, but on a faster timeline because the business has committed to the capability.

Trigger 3: Security advisory. A vulnerability is disclosed in the model (a prompt-injection class, a training data leak, a jailbreak chain) that affects production systems. Lead time is days to weeks. Action required: emergency migration, patch, or compensating controls. This is the fastest-moving trigger and the most expensive per incident.

Trigger 4: License change. The model’s license, terms of service, data-use clause, or pricing structure changes in a way that breaks the buyer’s compliance posture or unit economics. Lead time is 30 to 90 days, depending on contract. Action required: switch providers, restructure the integration, or re-negotiate terms. Less common than sunsets but harder to recover from when it happens.

A reserve sized for sunset-only is structurally underprovisioned. The reserve has to cover the sum of the four triggers’ expected cost over the project lifecycle.
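The four triggers can be logged in a small record that makes lead time explicit. The sketch below is illustrative only: the class names, the `example-model-v1` identifier, and the 5-week default (borrowed from the length of the planned playbook later in this piece) are assumptions, not part of any vendor API.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class Trigger(Enum):
    VENDOR_SUNSET = "vendor sunset"
    CAPABILITY_CLIFF = "capability cliff"
    SECURITY_ADVISORY = "security advisory"
    LICENSE_CHANGE = "license change"


@dataclass(frozen=True)
class MigrationEvent:
    trigger: Trigger
    model: str       # hypothetical model identifier
    logged_on: date  # day the trigger was detected
    deadline: date   # sunset date, roadmap commitment, or contract date


def weeks_of_lead_time(event: MigrationEvent) -> float:
    """Lead time between detection and the deadline, in weeks."""
    return (event.deadline - event.logged_on).days / 7


def needs_emergency_path(event: MigrationEvent, planned_weeks: float = 5.0) -> bool:
    """True when lead time is shorter than the planned playbook (assumed 5 weeks)."""
    return weeks_of_lead_time(event) < planned_weeks


advisory = MigrationEvent(
    Trigger.SECURITY_ADVISORY, "example-model-v1", date(2026, 1, 1), date(2026, 1, 15)
)
print(needs_emergency_path(advisory))  # two weeks of lead time is below the default
```

A sunset with six months of notice would take the planned path under the same check, which is why a sunset-only reserve misses the expensive cases.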

What migration costs

Each migration decomposes into four cost lines.

Re-eval (40 to 60 percent of migration cost). The full eval suite re-runs against the new model. Regressions surface (typically 5 to 15 percent of test cases shift behavior). Each regression requires triage: read reasoning traces, hypothesize cause, validate, decide on remediation. Detailed in the hidden cost of AI evals.

Prompt rewrite (20 to 30 percent). Prompts that depended on quirks of the old model (specific tokenization patterns, formatting preferences, instruction-following idioms) break or degrade on the new model. The work is senior engineering time spent on prompt re-engineering, with eval validation per change. The cost scales with how many prompts the system has and how heavily prompt-engineered the old setup was.

Retrieval and tool-use adjustment (10 to 20 percent). Retrieval-augmented systems often have prompts and chunking strategies tuned to the old model’s context window and attention patterns. Tool-use systems have call-pattern conventions tuned to the old model’s tool-calling behavior. Both need adjustment, plus eval validation that the adjusted system still passes the threshold.

Threshold re-lock and stakeholder communication (10 to 20 percent). Some test cases regress; some improve. The new threshold has to be locked, communicated to the buyer, and signed off. Stakeholder communication is typically underestimated; the buyer wants to understand what changed, why, and what the new performance baseline means for their product.

Total cost per migration. Two to four engineering weeks for a well-instrumented system. Six to twelve engineering weeks for a poorly-instrumented system (no eval suite, no prompt registry, no telemetry on which prompts matter). The gap between well-instrumented and poorly-instrumented systems is one of the largest predictors of migration cost.
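As a rough illustration of the decomposition, the sketch below splits a migration estimate across the four cost lines. The share ranges come from the breakdown above; the function name and the specific split used in the example (50/25/15/10) are assumptions chosen so the shares sum to 100 percent.

```python
# Share ranges per cost line, taken from the breakdown above.
SHARE_RANGES = {
    "re_eval": (0.40, 0.60),
    "prompt_rewrite": (0.20, 0.30),
    "retrieval_and_tools": (0.10, 0.20),
    "relock_and_comms": (0.10, 0.20),
}


def split_migration_cost(total_weeks: float, shares: dict[str, float]) -> dict[str, float]:
    """Split an estimate in engineering weeks across the four cost lines."""
    if set(shares) != set(SHARE_RANGES):
        raise ValueError("shares must cover exactly the four cost lines")
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1.0")
    for line, share in shares.items():
        low, high = SHARE_RANGES[line]
        if not low <= share <= high:
            raise ValueError(f"{line} share {share} is outside {low} to {high}")
    return {line: round(total_weeks * share, 2) for line, share in shares.items()}


# Hypothetical split for a 4-week migration of a well-instrumented system.
example = split_migration_cost(
    4,
    {
        "re_eval": 0.50,
        "prompt_rewrite": 0.25,
        "retrieval_and_tools": 0.15,
        "relock_and_comms": 0.10,
    },
)
print(example)
```

The range check is there because the published ranges overlap: a split can sit inside every range and still fail to sum to 100 percent, so both conditions are validated.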

How to size the deprecation reserve

The reserve is sized as a percentage of total project cost over the project’s expected lifecycle. Three factors set the percentage.

Factor 1: Expected lifecycle length. A 12-month system will encounter 1 to 2 forced migrations (typically one major sunset, plus a possible capability-cliff or security-advisory event). A 24-month system encounters 2 to 4 migrations. A 36-month system encounters 3 to 6 migrations. Reserve size scales roughly linearly with lifecycle length.

Factor 2: System exposure profile. Single-model systems with one tightly-coupled prompt library are high-exposure. Multi-model systems with abstracted model routing (LiteLLM proxy, OpenRouter integration) are low-exposure. The high-exposure case carries 1.5x to 2x the migration cost of the low-exposure case for the same lifecycle.

Factor 3: Regulatory and risk profile. Regulated systems (healthcare, finance, legal) have higher migration cost because the eval suite is more rigorous and the stakeholder-communication line is heavier. Internal-facing tools have lower migration cost. The variance is roughly 1.5x.
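Factor 1 can be combined with the per-migration cost figures to get a rough range of engineering weeks at risk. The sketch below is a back-of-envelope calculation, not a sizing method: the dictionary and function names are hypothetical, and the exposure and regulatory multipliers from Factors 2 and 3 are deliberately left out because the text does not fix a baseline for them.

```python
# Expected forced migrations by lifecycle length in months (Factor 1).
MIGRATIONS_BY_LIFECYCLE = {12: (1, 2), 24: (2, 4), 36: (3, 6)}

# Engineering weeks per migration, from the total-cost-per-migration figures.
WEEKS_PER_MIGRATION = {
    "well_instrumented": (2, 4),
    "poorly_instrumented": (6, 12),
}


def migration_weeks_range(lifecycle_months: int, instrumentation: str) -> tuple[int, int]:
    """Low and high estimate of engineering weeks spent on forced migrations."""
    low_count, high_count = MIGRATIONS_BY_LIFECYCLE[lifecycle_months]
    low_weeks, high_weeks = WEEKS_PER_MIGRATION[instrumentation]
    return low_count * low_weeks, high_count * high_weeks


print(migration_weeks_range(24, "well_instrumented"))    # (4, 16)
print(migration_weeks_range(24, "poorly_instrumented"))  # (12, 48)
```

For the same 24-month lifecycle, the poorly-instrumented range is three times the well-instrumented one at both ends, which is the instrumentation gap expressed in weeks.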

The sizing table.

| Project profile | Reserve size | Rationale |
| --- | --- | --- |
| Internal tool, abstracted model routing, 12-month lifecycle | 5 percent | Low exposure, short lifecycle |
| Customer-facing system, single-model, 12-month lifecycle | 8 percent | Medium exposure |
| Customer-facing system, single-model, 24-month lifecycle | 12 percent | Multiple migrations |
| Regulated production system, single-model, 24-month lifecycle | 15 percent | High exposure, regulated rigor |

Lower than 5 percent is structurally underprovisioned. Higher than 15 percent is usually a sign the system architecture should be refactored toward model abstraction rather than budgeting for the cost of not abstracting.
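The sizing table can be encoded as a lookup so the reserve is computed rather than guessed. This is a minimal sketch: the `Profile` fields, the label strings, and the 400,000 project cost are hypothetical, and only the four rows from the table are covered, so any other profile raises a `KeyError` rather than inventing a percentage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    audience: str          # "internal", "customer_facing", or "regulated"
    routing: str           # "abstracted" or "single_model"
    lifecycle_months: int


# Rows of the sizing table, as fractions of total project cost.
RESERVE_TABLE = {
    Profile("internal", "abstracted", 12): 0.05,
    Profile("customer_facing", "single_model", 12): 0.08,
    Profile("customer_facing", "single_model", 24): 0.12,
    Profile("regulated", "single_model", 24): 0.15,
}


def reserve_amount(total_project_cost: float, profile: Profile) -> float:
    """Reserve in currency units for a profile that appears in the table."""
    return round(total_project_cost * RESERVE_TABLE[profile], 2)


def check_reserve(fraction: float) -> str:
    """Apply the 5 and 15 percent guard rails to a proposed reserve."""
    if fraction < 0.05:
        return "underprovisioned"
    if fraction > 0.15:
        return "consider refactoring toward model abstraction"
    return "within range"


profile = Profile("customer_facing", "single_model", 24)
print(reserve_amount(400_000, profile))  # 48000.0
print(check_reserve(0.03))               # underprovisioned
```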

The migration playbook

A documented playbook turns a 12-week unplanned migration into a 4-week planned migration. The playbook has six stages.

Stage 1: Trigger detection (week 0). Vendor sunset announcement, capability decision, security advisory, or license change is logged. The migration owner (named in the SOW or retainer) opens a migration project.

Stage 2: Migration scoping (week 1). Catalog every model integration point in the system: every prompt, every tool integration, every retrieval call. For each, document what the new model is, what the eval implication is, and what the migration scope is. Output: a scoped migration plan with effort estimate.

Stage 3: Eval baseline on the new model (week 1 to 2). Run the existing eval suite on the new model with no other changes. Catalog the regressions. Categorize: prompt-fixable, retrieval-fixable, threshold-relockable, or accept-as-trade-off.

Stage 4: Migration implementation (week 2 to 4). Implement the prompt, retrieval, and tool-use changes per the migration plan. Eval validation per change. Threshold re-lock decisions documented and signed off by the buyer.

Stage 5: Stakeholder communication (week 4). Migration report to the buyer covering what changed, what the new performance baseline is, what residual risks exist, and what was decided as a trade-off. The buyer’s signed sign-off marks the migration as complete.

Stage 6: Post-migration retrospective (week 5). Document what surprised the team, what the actual migration cost was against the estimate, and what playbook updates are needed for the next migration. The retro is what turns the reserve from a contingency line into a learning line.

The playbook converts deprecation from a recurring fire-drill into a planned operational cadence. Mature 2026 retainers include the playbook as named scope and trigger it on each detected migration.
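Stage 3 is the most mechanical part of the playbook and is easy to sketch. The example below assumes eval results are available as a mapping from test-case ID to pass/fail, which is a simplification; the function names and case IDs are hypothetical, and the four triage categories are the ones named in Stage 3.

```python
from collections import Counter

# Triage categories from Stage 3 of the playbook.
CATEGORIES = {
    "prompt-fixable",
    "retrieval-fixable",
    "threshold-relockable",
    "accept-as-trade-off",
}


def find_regressions(old: dict[str, bool], new: dict[str, bool]) -> list[str]:
    """Test cases that passed on the old model and fail (or are missing) on the new one."""
    return sorted(
        case for case, passed in old.items() if passed and not new.get(case, False)
    )


def regression_rate(old: dict[str, bool], new: dict[str, bool]) -> float:
    """Share of the old suite that regressed on the new model."""
    return len(find_regressions(old, new)) / len(old)


def summarize_triage(triage: dict[str, str]) -> Counter:
    """Count regressions per category, rejecting categories outside the playbook."""
    unknown = set(triage.values()) - CATEGORIES
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    return Counter(triage.values())


old_run = {"case-a": True, "case-b": True, "case-c": False, "case-d": True}
new_run = {"case-a": True, "case-b": False, "case-c": True, "case-d": True}
print(find_regressions(old_run, new_run))  # ['case-b']
print(regression_rate(old_run, new_run))   # 0.25
```

Note that this counts only regressions. Cases that improve on the new model (like `case-c` here) also shift behavior and feed into the threshold re-lock decision in Stage 4.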

Reserve governance: what finance verifies

Finance teams responsible for AI project budgets should verify five things on the deprecation reserve.

One. Reserve is sized against an explicit lifecycle assumption (12, 24, 36 months) with the rationale documented.

Two. Reserve is named as a separate line in the budget, not absorbed into a generic contingency. Contingency gets cut on review; named reserves get defended.

Three. A named migration owner exists in the SOW or retainer. The role does not have to be a separate hire, but it has to be named.

Four. A migration playbook is referenced in the SOW or retainer. The playbook does not have to be perfect, but it has to exist.

Five. Quarterly review of reserve consumption against expected. If the reserve is not being drawn down (no migrations occurred in the period), the reserve rolls forward. If the reserve is being drawn down faster than expected, the sizing is reviewed and the system architecture is reviewed for abstraction opportunities.

Reserve governance is light-weight discipline. The cost is paperwork. The cost of not having reserve governance is the deprecation cost surfacing as scope creep when the next sunset announcement lands.
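The quarterly review in item five reduces to a small decision rule. The sketch below is one possible reading of it: the function name and return shape are hypothetical, and the "on track" outcome for a reserve that is being drawn at or below the expected pace is an assumption, since the text only describes the two edge cases.

```python
def quarterly_review(reserve_total: float, drawn_to_date: float, expected_to_date: float) -> dict:
    """Compare reserve consumption against expectation and suggest an action."""
    remaining = reserve_total - drawn_to_date
    if drawn_to_date == 0:
        # No migrations occurred in the period: the reserve rolls forward.
        action = "roll forward"
    elif drawn_to_date > expected_to_date:
        # Drawn down faster than expected: revisit sizing and abstraction.
        action = "review sizing and architecture"
    else:
        # Assumed middle case, not spelled out in the governance list.
        action = "on track"
    return {"remaining": remaining, "action": action}


print(quarterly_review(48_000, 0, 12_000))
print(quarterly_review(48_000, 20_000, 12_000))
```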

Frequently asked questions

How often do AI models get deprecated?

Frequently. OpenAI has retired multiple GPT generations across 2024 and 2025. Anthropic has cycled Claude generations on a 12 to 18 month cadence. Google has deprecated Gemini variants. Across the field, expect a sunset announcement affecting an actively-used model every 12 to 24 months, with shorter intervals if the system is using a capability that the vendor is iterating on aggressively (long context, multimodal, reasoning).

Why can’t a project just pick a stable model and stay on it?

Two reasons. First, stable models stop being competitive. The successor generation is usually 30 to 60 percent cheaper for the same task and meaningfully better on the eval bar; staying on the old model means paying more for worse output. Second, vendor sunsets are non-optional. Even if the buyer prefers to stay, the API stops working on the announced sunset date. The choice is migrate proactively (planned, lower cost) or migrate reactively (emergency, higher cost).

Is 5 to 15 percent enough reserve?

For most projects, yes. The 5 to 15 percent range is sized against the cost of 1 to 4 migrations over the project lifecycle, weighted by the system’s exposure profile. Projects with unusually high exposure (multiple highly prompt-engineered single-model integrations, regulated environments, long-tail prompt libraries) can justify reserves of 15 to 25 percent. Projects below 5 percent are usually under-instrumented systems that are accumulating migration debt invisibly.

What’s the fastest way to reduce migration cost?

Model abstraction. A system that calls models through an abstraction layer (LiteLLM, OpenRouter, custom router) has structurally lower migration cost than a system that calls vendor SDKs directly throughout the codebase. The abstraction is a 1 to 2 week investment that pays back on the first migration. Detailed in the AI project FinOps playbook under model routing.
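To make the idea concrete, here is a minimal sketch of the "custom router" option. It does not use the LiteLLM or OpenRouter APIs; the class, the role name, and the model IDs are hypothetical, and the backends are stub callables standing in for real vendor SDK calls.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# A backend is any callable that takes a prompt and returns text.
# In a real system this would wrap a vendor SDK call.
Backend = Callable[[str], str]


@dataclass
class ModelRouter:
    routes: Dict[str, str]        # logical role -> model id
    backends: Dict[str, Backend]  # model id -> callable

    def complete(self, role: str, prompt: str) -> str:
        """Application code asks for a role, never for a specific model."""
        model_id = self.routes[role]
        return self.backends[model_id](prompt)

    def migrate(self, role: str, new_model_id: str) -> None:
        """Point a role at a new model; call sites stay unchanged."""
        if new_model_id not in self.backends:
            raise KeyError(f"no backend registered for {new_model_id}")
        self.routes[role] = new_model_id


router = ModelRouter(
    routes={"summarize": "old-model"},
    backends={
        "old-model": lambda prompt: "old:" + prompt,
        "new-model": lambda prompt: "new:" + prompt,
    },
)
print(router.complete("summarize", "hi"))  # old:hi
router.migrate("summarize", "new-model")
print(router.complete("summarize", "hi"))  # new:hi
```

The routing change is one line, but the prompts and eval thresholds behind each role still need the re-eval work described earlier. The abstraction lowers the integration cost of a migration, not the validation cost.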

Should the deprecation reserve be in the build budget or the maintenance retainer?

The maintenance retainer, with a named clause covering up to N migrations per year. Build budgets are scoped against the build deliverable; deprecation events happen post-launch and are operational. A retainer with a named migration clause is the right home for the reserve because it aligns the cost with when it gets spent.

What if the vendor extends the sunset date?

The reserve rolls forward. The deprecation reserve is not a use-it-or-lose-it line; it is a planned investment against expected migration events over the project lifecycle. If a sunset is extended, the reserve covers the next event instead. The sizing is against the expected number of events over the lifecycle, not against any specific event.

How does this relate to model-upgrade re-eval cost?

Model-upgrade re-eval is one of the four lines a migration decomposes into. A planned model upgrade (a project chooses to migrate to a better model) and a forced deprecation migration (the vendor sunsets) have similar cost decompositions but different triggers. The deprecation reserve covers forced migrations specifically; the model-upgrade re-eval line covers both forced and chosen migrations. Treat them as adjacent budget lines that share an owner.

What’s the right size if the project does not yet have an eval suite?

Higher than 15 percent, likely 20 to 30 percent. Without an eval suite, every migration is closer to a from-scratch validation than a re-baseline, and the cost variance is much larger. The reserve sizing assumes the project has eval discipline. Projects without eval discipline should fix that first; the reserve sizing is a downstream optimization.

Key takeaways

  • Most production AI systems in 2026 will be forced through one or more model migrations. The triggers (vendor sunset, capability cliff, security advisory, license change) are non-optional and have lead times from days to 18 months.
  • Each migration costs 2 to 12 engineering weeks of unplanned work. Cost decomposes into re-eval (40 to 60 percent), prompt rewrite (20 to 30 percent), retrieval and tool-use adjustment (10 to 20 percent), threshold re-lock and stakeholder communication (10 to 20 percent).
  • The reserve is sized at 5 to 15 percent of total project cost, scaled by lifecycle length, system exposure profile, and regulatory profile. Lower than 5 percent is structurally underprovisioned; higher than 15 percent usually means the architecture should be refactored toward model abstraction.
  • A documented migration playbook turns a 12-week unplanned migration into a 4-week planned migration. The playbook has six stages: trigger detection, migration scoping, eval baseline on new model, migration implementation, stakeholder communication, post-migration retrospective.
  • Reserve governance is light-weight: reserve sized against an explicit lifecycle assumption, named line in the budget (not absorbed into contingency), named migration owner in the SOW, playbook referenced in the SOW or retainer, quarterly review of consumption against expected.

Model deprecation is not optional. The reserve is not optional. The cost of reserving for it is paperwork. The cost of not reserving for it is the migration surfacing as scope creep on the next sunset announcement.

Last Updated: Jun 10, 2026

Arthur Wandzel
