Why AI Commoditization Will Erase 60% of Your 'Build' Decisions by 2027

Three forces (token-price decay, agent-template proliferation, and frontier-model leapfrogs) will commoditize 60 percent of the AI capabilities organizations are actively building in 2026 by the end of 2027. The 60 percent is predictable: agent orchestration scaffolding, RAG pipelines, eval test runners, prompt-versioning systems, model-routing infrastructure, and observability tooling for AI workloads. All of it will be available as buy products by Q4 2027, with quality and feature surface that the in-house version cannot match. The 40 percent that survives is also predictable: proprietary data assets, domain-specific eval sets, critical-path UX that compounds with usage, integrations into customer-specific workflows, and the institutional eval discipline that decides what “good” means. The forecasting question is not whether the commoditization will happen (it is already happening) but how to make 2026 build decisions that survive into 2027 instead of being erased by it. This piece names the three forces, the 60 percent that gets erased, the 40 percent that survives, and the playbook for distinguishing them in current build decisions.

This is a spoke under the AI build-vs-buy-vs-hire decision matrix for 2026. The matrix’s seventh principle is that every decision is re-litigated quarterly because conditions shift; this piece is the forward-looking forecast of the conditions about to shift over the next 12 to 18 months.

Why this is a forecast, not a guess

Predictions about AI in 18 months are usually wrong because the prediction surface is too broad. “AI will be more capable” is not a useful prediction; it is a tautology with no commitments. “Token prices will fall” is closer but still not actionable. The useful prediction is structural: which specific categories of capability will commoditize, by what mechanism, on what timeline, and what the structural change means for build-vs-buy decisions made today.

This forecast is structural and falsifiable. It rests on three named forces, each with an observable trajectory from 2024 to 2026 and a quantitative expression of the change expected through 2027. The 60-percent erasure number is derived from those forces by mapping 2026 build categories to the forces that will reach them and the timeline on which they will do so. The 40 percent that survives is the residual: the categories the structural forces do not reach, because the underlying assets are something other than commoditizable infrastructure.

The forecast is wrong if any of the three forces stalls. If token prices flatten, cost-driven commoditization slows. If agent templates fragment, consolidation does not happen. If frontier models hit a quality plateau, the leapfrog dynamic stops. Any one of those would invalidate part of the forecast; all three stalling would invalidate most of it. The current evidence, through Q1 2026, is that all three forces are accelerating, not stalling.

Force 1: token-price decay

The first force is the continued decay of foundation-model token prices. The trajectory is well-documented: the move from 2024’s frontier prices ($30/M input, $60/M output for GPT-4-tier capability) to 2026’s ($6 to $12 input, $30 to $60 output for equivalent capability) is a 60 to 80 percent decline in input prices in two years, with the quoted output prices falling by up to half. The drivers of the decline (improved inference efficiency, better hardware utilization, competitive pressure across providers) have no obvious floor.

The 2027 trajectory, projecting from current rates, brings frontier-tier inference to $1 to $3 per million input tokens and $5 to $15 per million output tokens. At those prices, several categories of in-house infrastructure stop being economically defensible.
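As a rough arithmetic check on the input-price figures above, the sketch below computes the 2024-to-2026 decline and extends it by one more year. It assumes constant geometric decay, which is a simplification rather than something the forecast states, and the function names are illustrative.

```python
def percent_decline(old: float, new: float) -> float:
    """Percentage drop from an old price to a new price."""
    return (old - new) / old * 100.0


def annual_retention(old: float, new: float, years: float) -> float:
    """Fraction of the price retained each year under constant geometric decay."""
    return (new / old) ** (1.0 / years)


# Input prices in dollars per million tokens, as quoted in the text.
INPUT_2024 = 30.0
INPUT_2026_LOW, INPUT_2026_HIGH = 6.0, 12.0

# Decline over the two years from 2024 to 2026.
decline_slow = percent_decline(INPUT_2024, INPUT_2026_HIGH)
decline_fast = percent_decline(INPUT_2024, INPUT_2026_LOW)

# Assumption: the same yearly rate holds for one more year.
proj_2027_fast = INPUT_2026_LOW * annual_retention(INPUT_2024, INPUT_2026_LOW, 2)
proj_2027_slow = INPUT_2026_HIGH * annual_retention(INPUT_2024, INPUT_2026_HIGH, 2)

print(f"2024-2026 input decline: {decline_slow:.0f}% to {decline_fast:.0f}%")
print(f"2027 input price at constant rate: ${proj_2027_fast:.2f} to ${proj_2027_slow:.2f}")
```

Under that assumption the 2027 input price comes out at roughly $2.70 to $7.60 per million tokens. Only the faster end of the recent trend lands inside the $1 to $3 range, so the lower part of that range depends on the decline continuing to accelerate.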

The categories that fall: token-cost reduction infrastructure (compression layers, aggressive caching, custom prompt-shortening systems), self-hosted inference for non-extreme-scale workloads (the buy alternative is now cheaper than self-hosting plus maintenance), and distillation pipelines aimed primarily at cost reduction (cost reduction was the value prop; the value prop has been arbitraged).

Distillation aimed at extreme-scale or domain reasons (per the build-buy-or-fine-tune frame) survives because the underlying gates (10M+ calls per day or genuine domain expertise) are not affected by the price decline.

Force 2: agent-template proliferation

The second force is the ongoing proliferation and consolidation of agent templates. In 2024 the agent landscape was a research zoo. In 2026 it has consolidated to a handful of production-grade frameworks (OpenAI Agents SDK, Anthropic agent harness, LangGraph, AutoGen) plus an ecosystem of vertical-specific templates that run on top.

The 2027 trajectory: by Q4 2027, an organization will be able to buy off-the-shelf agents for the 50 most common enterprise workflows. Customer support triage. Sales research. Financial document extraction. Legal contract review. Recruiting screening. HR policy assistance. Engineering ticket routing. Operations alert triage. The list is long and the templates are competing on quality, price, and ease of integration.

The categories that fall: any in-house agent that implements one of the 50 common workflows. The in-house build was correct in 2024 because the templates did not exist. By 2027 the templates exist, and the in-house build is a worse version of the buy alternative: fewer integrations, smaller eval set, narrower test coverage, slower feature velocity.

The categories that survive: agents on workflows that are not in the 50 common templates because they are domain-specific, customer-specific, or compose with proprietary data in ways the templates cannot. The same holds for agents built around the org’s specific eval discipline (per the AI moat audit Question 3) rather than around generic benchmarks.

Force 3: frontier-model leapfrogs

The third force is the continued leapfrog dynamic between frontier model providers. Every 6 to 9 months, one provider releases a model that is meaningfully better than the prior frontier; within 3 months the other providers match or exceed it; within another 3 months a third provider raises the bar. The cadence has been stable for 36 months and shows no sign of stopping.

The 2027 trajectory: another 2 to 3 frontier leapfrogs through end of 2027. Each leapfrog erodes a category of in-house capability that was built around the prior frontier’s specific limitations. Custom reasoning chains built to compensate for frontier weaknesses become obsolete when the next frontier handles the reasoning natively. Custom validation layers built to catch frontier hallucinations become obsolete when the next frontier hallucinates less. Custom tool-calling orchestration built around frontier tool-use limitations becomes obsolete when the next frontier handles tool-calling natively.

The categories that fall: in-house infrastructure compensating for current frontier weaknesses. The pattern is straightforward: if the in-house build is “we built X because the frontier model can’t do Y reliably enough,” the build has a 12-to-18-month obsolescence horizon, because Y is a feature the next frontier model will handle.

The categories that survive: in-house infrastructure that is independent of frontier capabilities. Anything proprietary-data-based, anything domain-eval-based, anything that depends on integration depth into customer systems. Frontier leapfrogs do not erode these because the leapfrog is a generic-capability event and these capabilities are non-generic.

The 60 percent that gets erased

Mapping the three forces to 2026 build categories produces the predictable 60 percent. The categories are:

Boring AI infrastructure (35 percent of the erased 60). Agent orchestration scaffolding. RAG pipeline construction. Eval test runners. Prompt-versioning systems. Model-routing infrastructure. Observability tooling for AI workloads. All of these are categories where serious buy products exist in 2026 and will be substantially better by 2027. The in-house versions are accumulating engineering debt that is being paid against a falling buy alternative.

Cost-reduction infrastructure (10 percent of the erased 60). Token-cost reduction layers. Self-hosted inference for non-extreme workloads. Distillation pipelines justified primarily by cost. Per Force 1, the underlying cost is dropping faster than the in-house version can save.

Common-workflow agents (10 percent of the erased 60). In-house agents on the 50 most common enterprise workflows. Per Force 2, the templates that compete on these workflows will be ahead by Q4 2027 in feature surface, integration coverage, and eval rigor.

Frontier-compensation infrastructure (5 percent of the erased 60). In-house code compensating for current frontier limitations on reasoning, tool use, hallucination, or context handling. Per Force 3, these limitations are the focus of the next 2 to 3 leapfrogs.

The total is 60 percent of current build decisions, distributed across the four categories; the category figures are percentage points of the whole portfolio (35 + 10 + 10 + 5 = 60). The exact percentages vary by organization: orgs that have over-built boring infrastructure will see erasure above 60 percent, and orgs that have built mostly moat-positive capabilities will see less. The structural prediction is that the average org’s 2026 build portfolio is about 60 percent commoditizable.

The 40 percent that survives

The 40 percent that survives 2027 is structural. It falls into four categories, again expressed as percentage points of the whole portfolio (15 + 10 + 10 + 5 = 40).

Proprietary-data assets (15 percent of the surviving 40). Capabilities built around data the org collects that competitors cannot collect. Customer-specific learning loops. Domain-specific training corpora. Time-asymmetric learning loops per the AI moat audit Question 10. The data assets are not commoditizable because they are not infrastructure; they are accumulated outputs of the org’s specific operation.

Domain-specific eval sets (10 percent of the surviving 40). The eval set that encodes the org’s specific quality bar, the org’s specific edge cases, the org’s specific user feedback. Per the AI moat audit Question 3, the eval set is one of the few load-bearing in-house assets in 2026 and remains so in 2027 because the eval set is the org’s institutional understanding of “good.”

Critical-path UX (10 percent of the surviving 40). The user-facing surface that is integrated into customer workflows, configured to customer-specific data and roles, and used many times per day per user. Per the AI moat audit Question 8, integration depth is structurally hard to commoditize because it is not a generic capability but a per-customer accumulation.

Customer-specific integrations (5 percent of the surviving 40). Integrations into specific customer systems, data sources, or workflows that competitors would have to build separately for each customer. The unit of work is per-customer, which prevents commoditization through generic templates.

The 40 percent that survives shares one property: it is not generic. The three forces (price decay, template proliferation, frontier leapfrogs) all operate on the generic capability surface. They do not reach the non-generic categories because the non-generic surface is, by definition, not addressable by generic solutions.
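The category figures from the erased and surviving breakdowns can be tallied in a few lines. This sketch reads each figure as percentage points of the whole build portfolio, which is an interpretation based on the figures summing to 60 and 40; the helper name `share_within` is illustrative.

```python
# Percentage points of the whole 2026 build portfolio, as listed in the text.
ERASED = {
    "boring AI infrastructure": 35,
    "cost-reduction infrastructure": 10,
    "common-workflow agents": 10,
    "frontier-compensation infrastructure": 5,
}
SURVIVING = {
    "proprietary-data assets": 15,
    "domain-specific eval sets": 10,
    "critical-path UX": 10,
    "customer-specific integrations": 5,
}


def share_within(group: dict[str, int]) -> dict[str, float]:
    """Convert portfolio points into each category's share of its own group."""
    total = sum(group.values())
    return {name: points / total for name, points in group.items()}


erased_total = sum(ERASED.values())
surviving_total = sum(SURVIVING.values())

print(f"erased: {erased_total}, surviving: {surviving_total}")
for name, share in share_within(ERASED).items():
    print(f"{name}: {share:.0%} of the erased group")
```

Replacing the default points with an org's own estimates gives a quick view of how far its portfolio departs from the average split.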

How to test a 2026 build decision against the 2027 forecast

Three diagnostic questions per build decision:

Question 1: is the capability generic or non-generic? A capability is generic if it is approximately the same shape across many organizations (RAG pipeline, agent orchestrator, eval runner, prompt registry). A capability is non-generic if its shape is determined by the org’s specific data, eval set, customers, or workflows. Generic capabilities will commoditize; non-generic capabilities will not.

Question 2: is the build justified by current frontier limitations? If the build is “we are building X because frontier model Y cannot do Z reliably enough,” the build has a 12-to-18-month obsolescence horizon, because Z is on the frontier-leapfrog roadmap. The build is correct in the short term but should be planned for replacement.

Question 3: would a buy product at 80 percent feature parity replace the build? If yes, the build is at risk because by 2027 there will be a buy product at 80+ percent feature parity. If no, because the build’s value comes from non-generic assets the buy product cannot replicate, the build is structurally durable.

Run the three questions per build decision. Builds that are generic, frontier-limitation-driven, and replaceable by an 80-percent buy product are in the erased 60. Builds that are non-generic, independent of frontier limitations, and not replaceable by a buy product are in the surviving 40.
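The three questions can be written down as a small classifier. This is a minimal sketch: the field names and the `BuildDecision` type are hypothetical, and the "mixed" outcome for builds that answer the questions inconsistently is an assumption, since the text only defines the two clear-cut cases.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BuildDecision:
    name: str
    is_generic: bool                    # Question 1
    driven_by_frontier_limit: bool      # Question 2
    replaceable_by_80pct_buy: bool      # Question 3


def classify(decision: BuildDecision) -> str:
    """Place a build decision in the erased 60, the surviving 40, or neither."""
    risk_flags = sum(
        [
            decision.is_generic,
            decision.driven_by_frontier_limit,
            decision.replaceable_by_80pct_buy,
        ]
    )
    if risk_flags == 3:
        return "erased 60"
    if risk_flags == 0:
        return "surviving 40"
    # Assumption: partial matches need a human judgment call.
    return "mixed: review manually"


portfolio = [
    BuildDecision("RAG pipeline", True, True, True),
    BuildDecision("domain eval set", False, False, False),
    BuildDecision("custom validation layer", False, True, True),
]
for decision in portfolio:
    print(f"{decision.name}: {classify(decision)}")
```

Keeping the three answers as explicit fields makes the quarterly re-check cheap: only the answers change, not the rule.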

What to do with the build decisions that fail the test

Build decisions in the erased 60 should not necessarily be killed today. They are correct in the short term: either the buy alternative does not yet exist at the quality bar the org needs, or it exists but switching cost makes it uneconomical. The right move is to manage the timeline.

Three actions:

Action 1: stop investing in long-term hardening. A capability that will be replaced by a buy alternative in 18 months should not be receiving long-term hardening investment. Code quality, architectural elegance, and performance optimization beyond the production threshold are all sunk costs at the obsolescence horizon. The team’s investment should match the lifetime.

Action 2: design for replaceability. The capability should be encapsulated behind an interface that allows a future buy alternative to substitute. The interface design is the safety mechanism; when the buy alternative arrives, the migration is an interface adapter rather than a re-architecture.

Action 3: redirect engineering capacity to the surviving 40. The capacity recovered from action 1 should be redirected to the non-generic capabilities that survive 2027. The eval set deserves disproportionate investment. The proprietary-data pipelines deserve disproportionate investment. The customer-specific integrations deserve disproportionate investment. The reallocation is the strategic move that makes the org structurally more defensible by 2027 rather than less.
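Action 2 can be illustrated with a narrow interface and an adapter. The sketch below uses a retrieval component as the example; the `Retriever` protocol, both classes, and the vendor client's `search(query, limit=...)` method are all hypothetical stand-ins, not a real product's API.

```python
from typing import Protocol


class Retriever(Protocol):
    """The only surface the rest of the application is allowed to depend on."""

    def retrieve(self, query: str, k: int) -> list[str]: ...


class InHouseRetriever:
    """Today's in-house build: naive keyword matching over local documents."""

    def __init__(self, documents: list[str]) -> None:
        self._documents = documents

    def retrieve(self, query: str, k: int) -> list[str]:
        terms = query.lower().split()
        scored = [(sum(t in doc.lower() for t in terms), doc) for doc in self._documents]
        ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
        return [doc for score, doc in ranked if score > 0][:k]


class VendorRetrieverAdapter:
    """Future buy alternative, wrapped to fit the same interface.

    Assumes a hypothetical client whose search() returns dicts with a "text" key.
    """

    def __init__(self, client) -> None:
        self._client = client

    def retrieve(self, query: str, k: int) -> list[str]:
        return [hit["text"] for hit in self._client.search(query, limit=k)]


def answer_context(retriever: Retriever, question: str) -> str:
    """Application code: unchanged whichever implementation is plugged in."""
    return "\n".join(retriever.retrieve(question, k=2))
```

Because `answer_context` depends only on `Retriever`, swapping in the buy alternative means writing one adapter class rather than touching the callers.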

The detail on how to identify the surviving capabilities is in the AI moat audit; capabilities scoring 7+ on the audit are the ones that align with the surviving 40.

Frequently asked questions

What if the forecast is wrong and commoditization slows?

The forecast can be partially wrong without the strategic implication changing. If the 60-percent number turns out to be 40 percent, the action is the same: redirect capacity from the most commoditization-vulnerable categories to the moat-positive categories. The strategic move is robust to the magnitude of the prediction; only the timing changes.

Should we accelerate the buy migration today?

Generally no. Premature migration produces all the cost of the migration and none of the buy-alternative quality (because the buy alternative is not yet good enough for production). The right pacing is: build with replaceability in mind today, monitor buy-alternative quality quarterly, and migrate when the buy alternative crosses the quality bar. For most categories that will be in 2027 to 2028; some will be earlier.

How do we tell our engineering team that 60 percent of their work will be obsolete?

Frame it as the strategic clarification it is. The 60 percent is infrastructure that the team should be proud to have shipped, and that should be retired when the buy alternative is good enough. The team’s career value is in the 40 percent that survives: proprietary data work, eval discipline, critical-path UX, customer integration. The reallocation is an upgrade for the team’s portfolio, not a demotion.

What about regulated industries where the buy alternatives won’t catch up?

Regulated industries shift the commoditization timeline rather than negating it. The buy alternatives in regulated industries are 12 to 18 months behind unregulated alternatives because compliance certifications take time. The forecast for regulated industries is the same shape, shifted by 12 to 18 months: 60 percent erasure by 2028 rather than 2027. The strategic action is the same.

Does this prediction apply to non-AI capabilities?

The structural pattern (commoditization erodes generic capability, non-generic survives) is general. The AI-specific element is the speed: three forces operating in parallel produce a faster commoditization than is typical for software categories. Non-AI software commoditization happens on a 5-to-10-year horizon; AI commoditization is happening on an 18-to-24-month horizon.

How does this interact with the build-vs-buy-vs-hire matrix?

The matrix says to re-litigate every decision quarterly. The forecast is the forward-looking version of the re-litigation: instead of revisiting decisions made in 2024 against 2026 conditions, the forecast lets orgs make 2026 decisions against expected 2027 conditions. The matrix and the forecast are complementary practices built on the same underlying discipline.

What if our org has few capabilities in the 40 percent?

That is a strategic problem and the forecast surfaces it. An org whose AI portfolio is mostly the erased 60 percent has been investing in infrastructure rather than in moat. The corrective is to identify proprietary-data opportunities, build eval discipline, deepen critical-path UX, and develop customer-specific integrations, explicitly and with calendared milestones. The detail is in the AI moat audit action plan.

How will the commoditization affect AI agency engagements?

Agency engagements that are heavily focused on the erased 60 percent will see margin compression and contract pressure as customers can buy the equivalent at lower cost. Agencies that have moved their engagement focus to the surviving 40 percent (data work, eval discipline, customer-specific integration) will see expansion. The detail is in the AI hybrid playbook and in the case for boutique AI agencies in the era of LLM commoditization.

Will frontier model leapfrogs eventually slow?

Probably yes, on a timeline beyond this forecast. The 6-to-9-month leapfrog cadence has been remarkably stable through 2024 to 2026, but the underlying technical trajectory will eventually flatten. The current forecast covers 2027; whether the same forces operate at the same rate through 2030 is a separate question.

How often should we update the forecast?

Annually. The three forces have stable enough trajectories that quarterly updates would over-react. The annual cadence catches structural changes in the forces (e.g., a frontier model that breaks the 6-to-9-month leapfrog rhythm in either direction) without producing whiplash on quarterly decision-making.

Key takeaways

Three forces (token-price decay, agent-template proliferation, frontier-model leapfrogs) will commoditize approximately 60 percent of 2026 AI build decisions by the end of 2027. The 60 percent is predictable: boring infrastructure, cost-reduction infrastructure, common-workflow agents, frontier-compensation infrastructure. All of it will be available as competitive buy products at quality bars the in-house versions cannot match.

The 40 percent that survives is also predictable: proprietary-data assets, domain-specific eval sets, critical-path UX, customer-specific integrations. The surviving 40 shares one property: it is not generic. The three commoditization forces all operate on the generic capability surface, so they do not reach the non-generic categories.

The diagnostic for a current build decision is three questions: is the capability generic or non-generic, is the build justified by current frontier limitations, and would a buy product at 80 percent feature parity replace the build. Builds that are generic, frontier-driven, and 80-percent-replaceable are in the erased 60 and should be managed for the obsolescence horizon: stop long-term hardening, design for replaceability, redirect capacity to the surviving 40.

The strategic implication is reallocation. Most orgs will discover that their AI portfolio has too much in the erased 60 and not enough in the surviving 40. The corrective is calendared and explicit: invest disproportionately in the eval discipline, the proprietary-data pipelines, the critical-path UX, and the customer-specific integrations that will still be there in 2027. The orgs that make the reallocation now will be structurally more defensible by 2027; the orgs that don’t will spend 2027 catching up while paying off the obsolete 60.

Last Updated: Jun 17, 2026

Arthur Wandzel
