Feature-by-feature estimation produces a 2018-software project plan with a confidence interval of ±10 percent. The same template applied to AI work in 2026 produces a confidence interval of ±60 percent, because AI features are non-deterministic and inference cost varies by orders of magnitude across user behavior. The replacement is not “estimate harder.” The replacement is to switch units: from feature-list budgets to unit-economics budgets, where cost-per-action × expected volume is the primary line item. This piece argues that finance teams should make the switch and engineering teams should help them.
The argument is the budgeting consequence of the AI project economics manifesto’s principle that evaluation is the unit of account. Manifesto-level principles need budgeting templates. Feature-list budgets are the wrong template. Unit-economics budgets are the right one.
Why feature-list estimation breaks for AI
Feature-list estimation works on three assumptions that hold for CRUD software and break for AI software.
Assumption 1: cost is roughly linear in features. A six-feature CRUD project costs roughly twice a three-feature one because each feature consumes a roughly comparable engineering budget. AI features violate this. Two features in the same project (say, “summarize this document” and “answer questions about this document”) can have a 50× cost ratio depending on retrieval design, eval bar, and output verbosity. A feature list with no per-feature cost decomposition is averaging away an order of magnitude of variance.
Assumption 2: cost is roughly stable in volume. A CRUD feature costs the same to operate at 1,000 daily users as at 100,000 daily users; the database scales, the cost line stays roughly flat. An AI feature serving 100,000 daily users costs roughly 100× as much as at 1,000, because inference is per-call, not per-deployment. Feature-list estimation that ignores volume is estimating a constant where the real number scales linearly with usage.
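The per-call scaling can be sketched in a few lines. This is a minimal illustration, not a pricing model: the function name, the five actions per user per day, and the $0.01 cost-per-action are all hypothetical inputs chosen only to make the ratio visible.

```python
def monthly_inference_cost(daily_users, actions_per_user_per_day,
                           cost_per_action, days=30):
    """Inference spend scales with the number of calls, not deployments."""
    return daily_users * actions_per_user_per_day * cost_per_action * days

# Hypothetical inputs: 5 actions per user per day at $0.01 per action.
small = monthly_inference_cost(1_000, 5, 0.01)
large = monthly_inference_cost(100_000, 5, 0.01)

print(f"1,000 daily users:   ${small:,.0f}/month")
print(f"100,000 daily users: ${large:,.0f}/month")
print(f"ratio: {large / small:.0f}x")
```

Holding per-user behavior constant, the cost ratio simply tracks the user ratio, which is the property a volume-blind feature estimate cannot express.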
Assumption 3: cost is roughly stable in time. A CRUD feature shipped in Q1 costs roughly the same to operate in Q3 because the underlying tech does not change. An AI feature shipped on a 2026 model migrates to a different model in Q3, with eval re-runs, prompt revisions, and threshold relocking, and the cost surface changes meaningfully. Feature-list estimation that does not name model-migration cost under-budgets the project by 15 to 25 percent.
The combined effect of three broken assumptions is that AI project budgets done by feature list under-estimate by a factor that compounds. Feature variance under-estimates by 30 to 40 percent. Volume under-estimates by 50 to 200 percent. Model-migration under-estimates by 15 to 25 percent. The aggregate produces the recurring failure mode: budget approves, project ships, month nine spending is 2× the original estimate, and the conversation with finance becomes a fight over scope rather than a fix to the unit of account.
What unit economics replaces it with
The unit-economics frame has one primary line item per action: cost-per-action × expected volume. Everything else is auxiliary. The structure looks like this.
For each user-facing action the system performs, the budget line is expected_volume × cost_per_action × engagement_duration. Cost-per-action is decomposed into the six standardized lines (input tokens, output tokens, amortized embedding, retrieval, amortized eval, amortized observability); see the AI cost-per-action framework: a unit-economics model that survives model upgrades for the unit definition. Expected volume is named explicitly with a low/expected/high range. Engagement duration is the contract horizon.
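A minimal sketch of that budget line, assuming the six-line decomposition is additive. The class and function names, the per-line dollar amounts, and the low/expected/high volumes are hypothetical placeholders, not figures from any real engagement.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CostPerAction:
    """Six-line decomposition of one action's cost, in dollars."""
    input_tokens: float
    output_tokens: float
    amortized_embedding: float
    retrieval: float
    amortized_eval: float
    amortized_observability: float

    def total(self) -> float:
        return (self.input_tokens + self.output_tokens
                + self.amortized_embedding + self.retrieval
                + self.amortized_eval + self.amortized_observability)

def budget_line(monthly_volume: float, cpa: CostPerAction, months: int) -> float:
    """expected_volume x cost_per_action x engagement_duration."""
    return monthly_volume * cpa.total() * months

# Hypothetical per-line costs that sum to $0.012 per action.
cpa = CostPerAction(0.004, 0.005, 0.0005, 0.001, 0.001, 0.0005)

# Hypothetical low/expected/high monthly volumes over a 12-month horizon.
volume_range = {"low": 3_000_000, "expected": 5_000_000, "high": 8_000_000}
for label, volume in volume_range.items():
    print(f"{label:>8}: ${budget_line(volume, cpa, 12):,.0f}")
```

Carrying the range through the formula, rather than a single point estimate, keeps the volume assumption visible in the number finance signs.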
A second line names the eval-engineering cost as a percentage of the unit-economics line, typically 30 to 40 percent; see the hidden cost of AI evals: where 35 percent of project budget goes for the empirical anchor.
A third line names model-migration reserve, typically 10 to 15 percent of the unit-economics line, sized to three to five model migrations per year at two to four engineering weeks each.
A fourth line names the retainer for ongoing operations, sized to the regression rate, typically 8 to 12 percent of the unit-economics line.
The four lines are the budget. There is no “feature 1: $80K, feature 2: $120K, feature 3: $60K.” The budget is structured around the units that drive cost, not around the features being shipped. Features are still scoped, designed, and reviewed; they just are not the unit of account in the budget.
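The four-line structure can be sketched as a single function. The default percentages are assumptions picked from inside the ranges quoted above (30 to 40, 10 to 15, and 8 to 12 percent); a real SOW would set them per engagement, and the function name is hypothetical.

```python
def four_line_budget(unit_economics, eval_pct=0.35,
                     migration_pct=0.12, retainer_pct=0.10):
    """Build the four budget lines from the unit-economics line.

    The three percentages are applied to the unit-economics line,
    not to the total.
    """
    lines = {
        "unit_economics": unit_economics,
        "eval_engineering": unit_economics * eval_pct,
        "migration_reserve": unit_economics * migration_pct,
        "operations_retainer": unit_economics * retainer_pct,
    }
    lines["total"] = sum(lines.values())
    return lines

# Hypothetical $1M unit-economics line.
for name, amount in four_line_budget(1_000_000).items():
    print(f"{name:>20}: ${amount:,.0f}")
```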
The shift is structural and uncomfortable. A finance team used to defending feature-line budgets to a CFO has to learn to defend a four-line budget where the largest line is “cost-per-action × volume” and the second largest is “eval engineering.” That conversation is harder the first time and much easier the second time, because the conversation is about real cost drivers rather than guessed feature estimates.
Why finance teams find unit economics more defensible
The counterintuitive finding from teams that have made this switch: unit-economics budgets are easier to defend to finance, not harder. Three reasons.
Each line is auditable. Cost-per-action × volume produces a number that can be checked against the production cost dashboard at any point. If the budget said $0.012 per action × 5M actions/month and the dashboard says $0.014 × 4.6M, the variance is mechanical to investigate (model migration moved the cost-per-action term, lower volume moved the volume term), and the conversation is about specific levers rather than vague slippage. Feature-list budgets do not have this property; “feature 3 is 20 percent over” is not investigable in the same way because the unit is squishy.
Volume assumptions are explicit. A unit-economics budget forces the buyer to write down the volume the system is expected to serve. That number is then a contract with reality. If the actual volume is 3× the expected, finance can see immediately that the inference line will be 3× larger and the conversation is about pricing or capacity, not budget overrun. Feature-list budgets bury volume inside the feature estimates, where it cannot be questioned and goes quietly wrong.
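The $0.012 × 5M versus $0.014 × 4.6M comparison can be split into a rate effect and a volume effect. The sketch below uses one common price/volume variance convention (rate change weighted by actual volume, volume change weighted by budgeted rate); other conventions allocate the cross term differently, and the function name is hypothetical.

```python
def decompose_variance(budget_cpa, budget_volume, actual_cpa, actual_volume):
    """Split monthly spend variance into rate and volume effects.

    With this convention the two effects sum exactly to the total variance.
    """
    rate_effect = (actual_cpa - budget_cpa) * actual_volume
    volume_effect = (actual_volume - budget_volume) * budget_cpa
    total = actual_cpa * actual_volume - budget_cpa * budget_volume
    return {"rate_effect": rate_effect,
            "volume_effect": volume_effect,
            "total": total}

v = decompose_variance(0.012, 5_000_000, 0.014, 4_600_000)
print(f"rate effect:   {v['rate_effect']:+,.0f}")    # higher cost-per-action
print(f"volume effect: {v['volume_effect']:+,.0f}")  # lower volume
print(f"total:         {v['total']:+,.0f}")
```

With these inputs the higher cost-per-action adds about $9,200 a month, the lower volume removes about $4,800, and the net overrun is about $4,400 on a $60,000 budgeted line.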
Model-migration reserve is named. Naming the reserve forces the conversation about model migrations as planned events rather than unplanned scope creep. When a migration consumes the reserve, the engagement is still on plan. When a migration consumes more than the reserve, the engagement renegotiates against an explicit baseline. Feature-list budgets put model-migration cost into the void where surprise lives.
The corollary: a CFO who has approved a unit-economics budget can answer board questions about AI spend in fundamentally honest terms. “Inference is about 65 percent of project cost; eval engineering is about 20 percent; model-migration reserve is about 8 percent; the operations retainer is the rest.” Compare with the CFO answering from a feature-list budget, who can say only “the AI project budget was $1.4M and we spent $1.7M; the team says it’s because of model migrations and unexpected scope.” The first conversation is a defensible operating story; the second is an unforced loss of credibility.
The migration: from feature list to unit economics
The migration from feature-list budgeting to unit-economics budgeting takes one or two SOW cycles. The mechanics.
SOW N: feature list with unit-economics annotations. Keep the feature list, because finance does not change templates overnight, but annotate each feature with the actions it provides, the expected volume per action, and the cost-per-action target. The annotations build the data backbone for the next SOW without forcing a template fight.
SOW N+1: unit economics primary, feature list auxiliary. Lead the budget with the four-line unit-economics structure. Keep the feature list as a deliverables appendix, not as the primary cost decomposition. The shift is now visible to finance and procurement; the contract milestones are tied to eval thresholds and unit-economics outcomes, not feature acceptances.
SOW N+2: unit economics native. The feature list disappears from the budget entirely. Features are scoped in the design appendix. The budget is the four lines. The contract milestones are eval-threshold passes plus the unit-economics targets. This is where mature 2026 engagements have landed.
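The SOW N annotations (actions per feature, expected volume per action, cost-per-action target) can be captured in a small record. This is a sketch under assumed names: `ActionAnnotation`, `AnnotatedFeature`, and the sample values are hypothetical and only illustrate the shape of the data.

```python
from dataclasses import dataclass, field

@dataclass
class ActionAnnotation:
    name: str
    monthly_volume: int            # expected actions per month
    cost_per_action_target: float  # dollars

@dataclass
class AnnotatedFeature:
    name: str
    feature_budget: float          # the feature-list line finance still sees
    actions: list[ActionAnnotation] = field(default_factory=list)

    def implied_monthly_inference(self) -> float:
        """Inference spend implied by the annotations, per month."""
        return sum(a.monthly_volume * a.cost_per_action_target
                   for a in self.actions)

# Hypothetical annotation of a single feature line.
feature = AnnotatedFeature(
    name="Email drafting",
    feature_budget=180_000,
    actions=[ActionAnnotation("drafted-email", 80_000, 0.09)],
)
print(f"{feature.name}: ${feature.implied_monthly_inference():,.0f}/month implied inference")
```

Keeping the feature budget and the annotations side by side lets the next SOW lead with the unit-economics numbers without re-collecting data.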
The migration requires one piece of new infrastructure on the buyer side: an action-level cost dashboard that gets populated during the engagement. Without the dashboard, the unit-economics numbers in the SOW have no way to be measured against reality, and the contract reverts to argument-by-anecdote on slippage. Standing up the dashboard is non-negotiable; it is the audit log of the unit-economics contract.
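At its core, such a dashboard is an aggregation of per-action cost records compared against the SOW targets. The sketch below assumes each completed action is logged as an `(action_name, cost_usd)` pair; that record shape, the function names, and the sample values are all hypothetical.

```python
from collections import defaultdict

def cost_per_action_report(events):
    """Aggregate (action_name, cost_usd) records into per-action totals."""
    totals = defaultdict(lambda: {"count": 0, "cost": 0.0})
    for action, cost in events:
        totals[action]["count"] += 1
        totals[action]["cost"] += cost
    return {
        action: {
            "count": t["count"],
            "total_cost": t["cost"],
            "cost_per_action": t["cost"] / t["count"],
        }
        for action, t in totals.items()
    }

def compare_to_sow(report, sow_targets):
    """Measured minus target cost-per-action for each action named in the SOW."""
    return {
        action: report[action]["cost_per_action"] - target
        for action, target in sow_targets.items()
        if action in report
    }

# Hypothetical event log and SOW targets.
events = [
    ("drafted-email", 0.08),
    ("drafted-email", 0.10),
    ("researched-lead", 0.20),
]
report = cost_per_action_report(events)
drift = compare_to_sow(report, {"drafted-email": 0.09, "researched-lead": 0.18})
for action, delta in drift.items():
    print(f"{action}: {delta:+.3f} USD vs target")
```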
Worked example: a 12-month engagement
A buyer wants an AI sales assistant for 200 reps. Feature-list version: “Lead enrichment $150K. Email drafting $180K. Meeting scheduler $90K. Eval and observability $45K. Total $465K, ±20%.” Finance signs off. Month nine the project is at $720K and finance is unhappy.
Unit-economics version. Three actions: researched-lead, drafted-email, scheduled-meeting. Volumes: 800/400/80 per rep per month across 200 reps. Cost-per-action: $0.18, $0.09, $0.04. Inference over 12 months: 200 × (800×0.18 + 400×0.09 + 80×0.04) × 12 = 200 × 183.20 × 12 ≈ $440K. Eval at 30 percent: $132K. Migration reserve at 12 percent: $53K. Retainer at 10 percent: $44K. Total ≈ $668K. The buyer can scope volume down or lock per-rep limits if the budget is too high.
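The arithmetic for this example can be checked with a short script. The volumes, costs, and percentages are the ones stated above; the variable names are illustrative. Evaluated without rounding, the inference formula gives about $440K and the four lines total about $668K.

```python
REPS = 200
MONTHS = 12

# action: (volume per rep per month, cost per action in dollars)
ACTIONS = {
    "researched-lead": (800, 0.18),
    "drafted-email": (400, 0.09),
    "scheduled-meeting": (80, 0.04),
}

per_rep_month = sum(volume * cost for volume, cost in ACTIONS.values())
inference = REPS * per_rep_month * MONTHS

# Percentages are applied to the inference (unit-economics) line.
eval_line = 0.30 * inference
migration_reserve = 0.12 * inference
retainer = 0.10 * inference
total = inference + eval_line + migration_reserve + retainer

for name, amount in [("inference", inference), ("eval", eval_line),
                     ("migration reserve", migration_reserve),
                     ("retainer", retainer), ("total", total)]:
    print(f"{name:>18}: ${amount:,.0f}")
```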
The unit-economics number is higher than the feature-list number. That is not a bug; it is closer to real cost. Finance prefers the painful real number because actual spend lands at $662K: within reserve, on plan, no surprise. The feature-list number was always going to be wrong.
Objections and responses
“Finance won’t approve a budget without a feature list.” Finance approves whatever line items have a defensible methodology behind them. Unit economics is more defensible than feature lists for AI work; the conversation is about teaching finance the new template, not abandoning their oversight role. The action-level dashboard is the proof.
“We don’t know the expected volume yet.” Then the SOW writes a low/expected/high range and the engagement renegotiates against the actual volume after the first 90 days. A budget that pretends to know volume to one decimal place is more dishonest than a budget that names a range.
“What about features that are R&D, not actions?” Pure-R&D work where the deliverable is a learning rather than an action stays on time-and-materials with eval-threshold milestones, the model the case against fixed-price AI development contracts describes. Most AI engagements have an R&D phase and a production phase; unit economics applies to the production phase.
“Cost-per-action is too volatile to forecast.” Cost-per-action is volatile, which is exactly why it should be on the budget rather than buried inside features. Naming it explicitly gives finance a lever to pull when it moves. Burying it inside features lets it move without finance noticing.
Frequently asked questions
Why does feature-list estimation break for AI work?
Three structural assumptions break: cost is not linear in features (50× cost ratios across features in the same project), cost is not stable in volume (inference is per-call), and cost is not stable in time (model migrations shift the cost surface most quarters). The aggregate effect is that feature-list budgets for AI work routinely under-estimate by 30 to 60 percent.
What is unit-economics budgeting in the AI context?
A budget structured around four lines: cost-per-action × expected volume × engagement duration; eval engineering as a percentage of that line; model-migration reserve; ongoing-operations retainer. Features are scoped in the design appendix but are not the unit of account in the budget itself.
How do I migrate from feature-list to unit-economics budgeting?
One or two SOW cycles. SOW N: feature list with unit-economics annotations. SOW N+1: unit economics primary, feature list auxiliary. SOW N+2: unit economics native. The buyer needs an action-level cost dashboard to make the unit-economics numbers measurable against reality.
Why is unit-economics budgeting more defensible to finance?
Each line is auditable against a production dashboard. Volume assumptions are explicit and contract with reality. Model-migration reserve is named rather than buried in scope creep. The CFO can answer board questions about AI spend in honest terms rather than narrating overruns.
What if I do not know expected volume yet?
Write a low/expected/high range and renegotiate against actual volume after 90 days. A budget pretending to know volume to one decimal place is more dishonest than one naming a range and a renegotiation clause.
Does this apply to features that are R&D rather than actions?
No. Pure R&D work where the deliverable is a learning rather than an action stays on time-and-materials with eval-threshold milestones. Unit economics applies to the production phase of an engagement, not the discovery phase.
Why is the unit-economics number usually higher than the feature-list number?
Because feature-list budgets routinely under-budget eval engineering, model migration, and inference at scale. The unit-economics number is closer to the real cost. Finance prefers the painful real number to the comfortable wrong one because the painful number does not produce a Q3 surprise.
How does this connect to the AI cost-per-action framework?
Cost-per-action is the unit; unit-economics budgeting is the budget template that uses the unit. The framework provides the operational primitive; this piece argues for the budgeting practice that primitive enables.
How does this connect to the AI project economics manifesto?
The manifesto names evaluation as the unit of account at the project level. Unit-economics budgeting is the way evaluation-cost-as-unit-of-account translates into a finance template the CFO can sign. Without this translation, manifesto principles stay aspirational rather than enforceable.
What is the first thing to do this quarter?
Pick one AI engagement currently in flight. Define its three to five actions. Compute cost-per-action against current spend. Compare with the original feature-list budget. The discrepancy is the cost of running on the wrong template, and the strongest argument for moving the next engagement to unit economics.
Key takeaways
- Feature-list estimation breaks for AI work on three structural assumptions: cost is not linear in features, not stable in volume, not stable in time. The aggregate under-estimate is typically 30 to 60 percent.
- Unit-economics budgeting replaces the feature list with four lines: cost-per-action × volume × duration; eval engineering; model-migration reserve; ongoing-operations retainer.
- Unit-economics budgets are easier to defend to finance than feature-list budgets because each line is auditable against a production dashboard, volume is explicit, and model-migration cost is named rather than buried.
- Migration takes one or two SOW cycles, runs on an action-level cost dashboard, and produces a budget the CFO can answer board questions from in honest terms.
- The unit-economics number is usually higher than the feature-list number. That is not a bug; it is the real cost surfacing instead of compounding into a Q3 surprise.
Arthur Wandzel