Consumption pricing is the structurally correct pricing model for AI products, and most B2B SaaS vendors will need to migrate from per-seat to consumption pricing between 2026 and 2028. The migration is mechanical if done right and catastrophic if done wrong. The mechanical version has six components: pick a metering unit that maps to both cost-to-serve and value-delivered, set a commit-plus-overage structure for enterprise customers, set a fair-use cap for self-serve customers, build the metering instrumentation, write the customer-communication migration playbook, and run the migration on a 90-to-180 day window with grandfathering for the heaviest existing customers. The catastrophic version skips one of the six components and produces 20 to 40 percent customer churn during the migration window. This piece is the playbook; the specific decisions, defaults, and migration sequence that work in 2026 B2B SaaS markets. Run it end-to-end and the migration produces stable revenue, expanded heavy-user economics, and a pricing model that survives token-price decay and capability commoditization for the next decade.
This is a spoke under the AI project economics manifesto. The manifesto names evaluation cost as the central economics framework on the project side; consumption pricing is the corresponding framework on the revenue side; what the vendor charges when the product is in production and operating at variable cost-to-serve.
The six components of a consumption-pricing migration
A complete migration has six work streams, each owned by a different function:
- Metering unit selection (Product + Finance, weeks 1-3).
- Commit-plus-overage structure design (Finance + Sales, weeks 2-5).
- Fair-use cap structure for self-serve (Product + Marketing, weeks 3-6).
- Metering instrumentation build (Engineering, weeks 1-10).
- Customer migration playbook (Customer Success, weeks 4-8).
- Migration execution (cross-functional, weeks 8-26).
Skipping any one of the six produces predictable failure modes. Skip the metering unit decision and the eventual unit will be the wrong one. Skip the commit-plus-overage and enterprise procurement will block. Skip fair-use and self-serve becomes a money-loser. Skip the instrumentation build and you ship pricing without the ability to bill it. Skip the customer playbook and migration churn spikes. Skip the migration window discipline and the pricing change drags on for 18 months.
Picking the metering unit
The metering unit is the single most important decision in the migration. Three valid candidates for B2B AI products in 2026:
- Tokens. Most AI vendors default to this because it maps directly to inference cost. Strength: clean cost-to-serve mapping. Weakness: customers struggle to budget against tokens because 1M tokens means different things for different use cases. Best for: developer-facing API products, infrastructure-layer AI.
- Calls or actions. A unit-of-action that maps to a customer-facing concept (a generated email, a resolved ticket, a research report, a document summarized). Strength: customers understand what they’re buying. Weakness: cost-to-serve varies by action complexity; vendors must price actions in tiers. Best for: vertical AI applications, customer-facing AI products.
- Credits. A vendor-defined unit that wraps multiple metering primitives (1 credit = 1K tokens for chat, 1 credit = 5 actions for agents, 1 credit = 1 generated document, etc). Strength: flexible across product surface. Weakness: opacity; customers don’t fully understand the conversion rates. Best for: multi-product AI platforms.
The pattern across successful B2B AI consumption pricing in 2025-2026 is to pick a single metering unit that’s visible to customers (calls, actions, or credits) and price it at a stable per-unit rate that includes inference cost plus 4 to 7x markup for non-inference value (support, eval discipline, observability, integration). The cost-per-action framework covers the unit economics in detail.
The classic mistake is metering on tokens when customers think in actions. Tokens are an internal cost-of-goods unit; actions are the customer-facing pricing unit. Vendors that meter on tokens externally produce confused buyers and unpredictable bills.
The commit-plus-overage structure
Enterprise customers need predictable annual budgets. Pure usage-based pricing fails procurement. Commit-plus-overage solves this:
- The commit. Customer commits to N units per year at a discounted rate (typically 25 to 40 percent off list price). Commit is paid upfront or quarterly.
- The overage rate. Usage above the commit is billed at the list rate (or a slight discount, typically 10 to 20 percent off list).
- The roll-over policy. Unused commit at year-end either rolls over (1 quarter forward, typically) or is forfeited. Forfeiture is more vendor-friendly; rollover is more customer-friendly. Mid-market customers prefer rollover; large enterprise prefers commit-and-true-up structures.
Default structure that works for most B2B AI products in 2026:
| Tier | Commit (units/year) | Discount on commit | Overage rate | Rollover |
|---|---|---|---|---|
| Mid-market | 100K | 25% | List | 1 quarter |
| Enterprise | 1M | 35% | 10% off list | 1 quarter |
| Strategic | 10M+ | 40% | 15% off list | Custom |
The commit-plus-overage structure gives the vendor 60 to 80 percent revenue predictability through commits and 20 to 40 percent elastic upside through overage; a much better revenue shape than pure per-seat or pure usage. The structure also gives procurement the predictable annual budget they need while preserving the vendor’s exposure to the heavy-user upside that broke per-seat pricing.
The fair-use cap for self-serve
Self-serve customers don’t want to think about consumption. They want a flat monthly price and predictable bills. Fair-use cap is how to give them that without losing money on heavy users:
- A flat monthly price that covers usage up to a cap (e.g., $30/month for up to 50K credits/month).
- A fair-use cap above which usage-based billing kicks in at list rate.
- A clear notification flow when customers approach the cap (typically alerts at 80 percent and 100 percent of cap).
The cap should be set at the 90th percentile of self-serve usage, not the median. Setting at the median means the vendor loses money on the top 10 percent of self-serve customers (which is where AI usage variance bites hardest). Setting at the 90th percentile means the top 10 percent pay overage but most customers experience the product as flat-priced.
The pricing-models-by-alignment piece shows where fair-use-cap pricing ranks against alternative AI engagement structures on outcome alignment.
A useful sanity check on cap design: the fair-use cap should be at least 5x the median usage on the self-serve plan. If the cap is below 3x median usage, normal users hit the cap and produce surprise bills, which spike support volume and churn. If the cap is above 10x median usage, the vendor leaves money on the table from heavy self-serve users.
Building the metering instrumentation
Engineering work for consumption pricing has three layers:
- Per-call metering. Most AI call (or other metering unit) is logged to a metering store with timestamp, customer ID, unit count, and metadata. This is the source of truth for billing.
- Aggregation pipeline. Hourly or daily aggregation of metering events into per-customer-per-period usage rolls. Aggregation must be exactly-once; double-counting bills produces customer trust crises.
- Billing system integration. The aggregation feeds into the billing system (Stripe Billing, Chargebee, Maxio, or a custom integration with the ERP). Invoices are generated against the rolls.
Most vendors underestimate the metering build. A clean metering implementation is typically 6 to 10 engineering-weeks for a single product. Multi-product vendors building a unified metering platform are looking at 4 to 8 engineering-months. The complexity sits in the exactly-once aggregation and the audit trail required for enterprise customers who want to verify their bills.
A pattern that works: use OpenMeter, Lago, or Orb as the metering layer rather than building from scratch. These products handle the aggregation correctness and audit trail and integrate cleanly with billing systems. Build-from-scratch is reasonable only for vendors with extreme metering volume (10M+ events per day) or unusual aggregation requirements.
The customer migration playbook
Migrating existing per-seat customers to consumption pricing is the hardest part of the transition. Three customer segments need different playbooks:
- Heavy users (top 10 percent). Currently paying flat per-seat for usage that costs the vendor more than the seat price. Migration to consumption pricing typically increases their bill 50 to 200 percent. Strategy: grandfathered transition over 12 to 18 months, with the consumption price introduced as a parallel track and the old per-seat phased out gradually. Some heavy users will churn; the vendor must accept this because holding them at unprofitable per-seat is structurally untenable.
- Median users (middle 80 percent). Currently paying per-seat for usage that’s roughly break-even or moderately profitable. Migration to consumption pricing should keep their bill roughly flat. Strategy: introduce consumption pricing as an opt-in alternative with a “your bill won’t increase” guarantee for the first 12 months, then default new contracts to consumption.
- Light users (bottom 10 percent). Currently paying per-seat for usage that’s highly profitable for the vendor. Migration should keep them on a flat plan (the fair-use cap structure) with the price unchanged or slightly reduced. Strategy: migrate to the self-serve fair-use plan automatically; communicate as “lower price, same product.”
The migration playbook works because it minimizes the surface where customers experience price increases. Heavy users see the increase but get advance notice and a transition runway. Median users see no change. Light users see neutral or slightly favorable changes.
The 90-to-180 day migration window
The migration should happen on a defined window, not as an open-ended project. The recommended sequence:
- Days 1-30: Internal preparation. Finalize metering unit, commit structure, fair-use caps. Build metering instrumentation. Write the customer playbook.
- Days 31-60: Sales and CS enablement. Train sales on the new pricing structure. Train CS on the migration conversation playbook. Build internal pricing tools.
- Days 61-90: Beta migration. Migrate 10 to 20 friendly customers across the three segments. Surface objections, refine the playbook, validate the metering instrumentation.
- Days 91-150: Broad migration. Migrate the remaining customer base segment by segment. Light users first (lowest risk), then median users, then heavy users with grandfathering.
- Days 151-180: Cleanup and policy lock. Migrate the final long-tail customers. Decommission per-seat as the default for new contracts. Lock the new pricing policy.
A 180-day migration window with this discipline typically produces 5 to 10 percent customer churn (mostly heavy users on the unprofitable end) and a 30 to 60 percent revenue uplift on the migrated cohort. An open-ended migration drags on for 18 to 24 months, accumulates churn from confused customers, and rarely fully retires the per-seat plan.
The anatomy-of-a-runaway-AI-project piece covers a related dynamic; projects without a forcing function tend to drift indefinitely. Pricing migrations have the same property; the migration window is the forcing function.
Frequently asked questions
What is consumption pricing for AI products?
Consumption pricing bills customers based on usage of a metered unit (tokens, calls, actions, or credits) rather than per-seat or flat-rate. For B2B SaaS AI products in 2026, the structurally correct model is consumption pricing with three layers: a fair-use cap for self-serve customers, commit-plus-overage for mid-market, and substantial commits with custom overage rates for enterprise. The seat construct survives as a governance overlay but is no longer the pricing primitive.
What metering unit should I pick?
Pick a unit that’s visible to customers (calls, actions, or credits) rather than tokens. Customers struggle to budget against tokens because 1M tokens means different things for different use cases. Tokens are an internal cost-of-goods unit; actions or credits are the customer-facing pricing unit. Price the customer-facing unit at 4 to 7x cost-to-serve to fund non-inference value (support, eval discipline, observability, integration).
How do I structure commit-plus-overage for enterprise customers?
Customer commits to N units per year at a 25 to 40 percent discount on list price. Usage above commit is billed at list rate or a slight discount (10 to 20 percent off list). Unused commit either rolls over one quarter or is forfeited. Default structure for B2B AI: mid-market 100K units/year at 25 percent off, enterprise 1M units/year at 35 percent off, strategic 10M+ units at 40 percent off with custom overage.
Where should I set the fair-use cap on self-serve?
At the 90th percentile of self-serve usage, not the median. Setting at the median loses money on the top 10 percent of self-serve customers. Setting at 90th percentile means most customers experience the product as flat-priced while the heavy 10 percent pay overage. The cap should be at least 5x median usage on the plan; below 3x produces surprise bills, above 10x leaves money on the table.
How long does the metering build take?
A clean metering implementation is typically 6 to 10 engineering-weeks for a single product. Multi-product vendors building a unified metering platform are looking at 4 to 8 engineering-months. The complexity sits in exactly-once aggregation and audit trail. Use OpenMeter, Lago, or Orb as the metering layer rather than building from scratch unless you have extreme volume or unusual aggregation requirements.
How do I migrate existing per-seat customers without churn?
Three-segment strategy. Heavy users (top 10 percent, currently underpaying) get a grandfathered transition over 12 to 18 months with the new pricing introduced as a parallel track. Median users (middle 80 percent) get an opt-in migration with a year of bill-stability guarantee. Light users (bottom 10 percent, currently overpaying) get migrated to the self-serve fair-use plan automatically with a flat or slightly lower price. This minimizes the surface where customers experience increases.
What’s the typical migration window?
90 to 180 days. Days 1-30: internal preparation. Days 31-60: sales and CS enablement. Days 61-90: beta migration with 10 to 20 friendly customers. Days 91-150: broad migration by segment. Days 151-180: cleanup and policy lock. Open-ended migrations drag on for 18 to 24 months and rarely fully retire per-seat. The window is the forcing function.
What revenue impact should I expect?
Typically 30 to 60 percent revenue uplift on the migrated customer base, with 5 to 10 percent customer churn (mostly heavy users on the unprofitable end whose per-seat pricing was structurally untenable). The uplift comes from heavy users now paying their full cost-to-serve plus margin, and from new heavy users that pure per-seat pricing was repelling.
Does consumption pricing work for vertical AI products?
Yes, often better than for horizontal AI products. Vertical AI products typically have a clearer customer-facing unit (resolved ticket, processed document, generated report) that maps cleanly to value-delivered. The metering unit decision is easier and the commit structure is easier to negotiate because the unit corresponds to a business outcome the customer already measures.
How does consumption pricing interact with the AI cost-of-quality framework?
Tightly. Cost-of-quality tells the vendor how much to spend on prevention (eval discipline, observability) to avoid external failure. Consumption pricing distributes that cost across customers proportional to their usage. Heavy users; who drive most of the eval and observability cost through their volume; pay proportionally for that infrastructure rather than free-riding on per-seat pricing where light users subsidize them. The two frameworks are mutually reinforcing.
Key takeaways
- Consumption pricing migration has six components: metering unit selection, commit-plus-overage design, fair-use cap design, metering instrumentation build, customer migration playbook, and migration execution. Skipping any one produces predictable failure modes.
- Pick a customer-facing metering unit (calls, actions, or credits) rather than an internal cost unit (tokens). Price at 4 to 7x cost-to-serve to fund non-inference value.
- Commit-plus-overage gives 60 to 80 percent revenue predictability through commits and 20 to 40 percent elastic upside through overage. Default mid-market 100K units/year at 25 percent off, enterprise 1M units at 35 percent off, strategic 10M+ at 40 percent off.
- Set the self-serve fair-use cap at the 90th percentile of usage. Below 3x median produces surprise bills; above 10x median leaves money on the table.
- Migration window of 90 to 180 days is the forcing function. Open-ended migrations drag for 18 to 24 months and rarely retire per-seat. Expect 30 to 60 percent revenue uplift with 5 to 10 percent customer churn.
The consumption pricing playbook is mechanical if executed end-to-end and catastrophic if executed in pieces. The vendor that runs the full playbook in a defined 180-day window comes out the other side with a pricing model that survives token-price decay, capability commoditization, and the next decade of AI substrate change. The vendor that runs it piecemeal stays trapped in per-seat for another two years and absorbs the margin compression in the meantime.
Arthur Wandzel