Large Language Models Have a Shorter 'Shelf Life' Than Milk

💡 The breakneck iteration speed of large AI models has sent pricing on a roller-coaster ride. From sky-high premiums to rock-bottom bargains, large models are undergoing an unprecedented revaluation as the industry landscape reshuffles at an accelerating pace.

Introduction: A Carton of Milk May Outlast a Large Language Model

In early 2024, a leading large model's API was priced at 120 yuan per million tokens. Less than six months later, models of comparable capability had dropped to under 1 yuan. A carton of milk typically has a shelf life of seven days to six months, yet a large model that once "wowed the world" can go from launch to market oblivion in less than three months.

The astonishingly short shelf life of large models is upending the commercial logic of the entire AI industry.

The Pricing Roller Coaster: From 'Worth Its Weight in Gold' to 'Dirt Cheap'

Looking back over the past two years, the pricing trajectory of large models has been nothing short of surreal.

In early 2023, OpenAI's GPT-4 API pricing made many developers balk — 30 dollars per million input tokens. At the time, models of comparable capability were few and far between, and pricing power rested firmly in the hands of a small number of players. Yet barely a year later, an all-out price war erupted:

  • In China, ByteDance's Doubao model led the charge by slashing prices to the "milli-yuan" level, with rates as low as 0.8 yuan per million tokens. Alibaba's Tongyi Qianwen and Baidu's ERNIE Bot quickly followed with price cuts and even free quotas. DeepSeek further disrupted the market with extreme cost-effectiveness.
  • Overseas, Google's Gemini Flash series sharply undercut rivals on price, and OpenAI responded by launching GPT-4o mini, priced more than 90% below GPT-4.
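
To make the scale of the price war concrete, here is a minimal sketch of the arithmetic. The two per-million-token prices are the figures quoted in this article (120 yuan and 0.8 yuan). The monthly workload of 500 million tokens and the function names are assumptions made purely for illustration, and the two prices belong to different models, so this is a rough comparison rather than a like-for-like benchmark.

```python
def workload_cost(price_per_million: float, tokens: int) -> float:
    """Cost of processing `tokens` tokens at a per-million-token price."""
    return price_per_million * tokens / 1_000_000


def percent_drop(old_price: float, new_price: float) -> float:
    """Relative price decline, as a percentage of the old price."""
    return (old_price - new_price) / old_price * 100


# Prices quoted in the article, in yuan per million tokens.
EARLY_2024_PRICE = 120.0
PRICE_WAR_PRICE = 0.8

# Assumed workload (illustrative only): 500 million tokens per month.
MONTHLY_TOKENS = 500_000_000

before = workload_cost(EARLY_2024_PRICE, MONTHLY_TOKENS)
after = workload_cost(PRICE_WAR_PRICE, MONTHLY_TOKENS)
drop = percent_drop(EARLY_2024_PRICE, PRICE_WAR_PRICE)

print(f"Monthly bill before: {before:,.0f} yuan")
print(f"Monthly bill after:  {after:,.0f} yuan")
print(f"Price drop: {drop:.1f}%")
```

Under these assumptions the same workload falls from 60,000 yuan a month to 400 yuan, a drop of more than 99%.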

One AI entrepreneur lamented: "The model we chose last quarter for its great value-for-money already feels like a rip-off this quarter."

Why Do Large Models 'Expire' So Quickly?

1. Moore's Law — The AI Turbo Edition

The traditional chip industry follows Moore's Law, popularly summarized as performance doubling roughly every 18 months. The large model space iterates far faster than that. Algorithm optimization, advances in training techniques, and momentum from the open-source community mean that stronger models can be trained with the same compute. Today's "flagship" could be merely "entry-level" three months from now.
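
As a rough back-of-the-envelope comparison, the sketch below estimates how quickly prices halved, using the introduction's figures (120 yuan falling to about 1 yuan in about six months). It assumes a smooth exponential decline, which is a simplification, and the function name is hypothetical. Price halving is also not the same quantity as chip performance doubling, so the comparison with the 18-month figure is only a loose analogy.

```python
import math


def halving_time_months(old_price: float, new_price: float, months: float) -> float:
    """Months per price halving, assuming a steady exponential decline."""
    halvings = math.log2(old_price / new_price)
    return months / halvings


# Figures from the introduction: 120 yuan -> about 1 yuan in about six months.
price_halving = halving_time_months(120.0, 1.0, 6.0)

# The 18-month doubling period cited for Moore's Law.
MOORE_DOUBLING_MONTHS = 18.0

print(f"Price halving time: {price_halving:.2f} months")
print(f"Moore's Law period is {MOORE_DOUBLING_MONTHS / price_halving:.0f}x longer")
```

On these numbers, prices halved in under a month, around twenty times faster than the 18-month cadence.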

2. The 'Dimensional Reduction Strike' of Open Source

Open-source models such as Meta's LLaMA series, Alibaba's Qwen series, and DeepSeek continue to iterate, constantly raising the capability baseline available for free. When open-source models approach or match the previous generation of closed-source flagships, the latter's commercial value erodes quickly, since few customers will pay a premium for capability they can get for free. This dynamic of "open source chasing closed source" dramatically compresses each model generation's commercial lifecycle.

3. Intensifying Homogeneous Competition

The gap in general-purpose capability among mainstream large models is narrowing. They jockey for position on benchmarks, but the perceived difference for end users is limited. When products become highly homogeneous, price becomes the most direct competitive weapon — hence the successive rounds of "price stampedes."

4. Continuously Falling Customer Switching Costs

Standardized APIs and increasingly mature model-routing middleware allow developers to switch between models almost seamlessly. Customer loyalty is extremely low, since buyers simply go with whoever is cheapest, which further accelerates model "expiration."
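
The "go with whoever is cheapest" behavior can be sketched in a few lines. This is a toy illustration, not any real routing middleware: the model names, prices, and quality scores are all made up, and the assumption is that a team scores each candidate on its own evaluation set and then picks the cheapest one that clears its quality bar.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelOption:
    name: str                 # hypothetical model identifier
    price_per_million: float  # yuan per million tokens (made-up values)
    quality: float            # score on the team's own evaluation set, 0 to 1


def pick_cheapest(options: list[ModelOption], min_quality: float) -> ModelOption:
    """Return the cheapest option whose quality meets the threshold."""
    eligible = [m for m in options if m.quality >= min_quality]
    if not eligible:
        raise ValueError("no model meets the quality threshold")
    return min(eligible, key=lambda m: m.price_per_million)


# Illustrative catalog; none of these entries describe a real product.
CATALOG = [
    ModelOption("model-a", 120.0, 0.92),
    ModelOption("model-b", 8.0, 0.90),
    ModelOption("model-c", 0.8, 0.85),
]

print(pick_cheapest(CATALOG, min_quality=0.88).name)  # prints "model-b"
print(pick_cheapest(CATALOG, min_quality=0.80).name)  # prints "model-c"
```

Once the catalog is just data, a new and cheaper entry displaces the incumbent the moment it clears the quality bar, which is exactly why loyalty is so thin.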

Survival Strategies Under 'Shelf-Life Anxiety'

Facing such a brutal iteration cadence, industry participants are seeking their own ways to "stay fresh":

For model providers:

  • Compete on speed: Shorten R&D cycles to maintain a capability lead, even if that window may last only a few weeks.
  • Compete on ecosystem: Build toolchains, plugin marketplaces, and fine-tuning platforms around the model, using ecosystem stickiness to counter rapid model depreciation.
  • Compete on use cases: Move from general-purpose to vertical domains — healthcare, legal, finance — and establish data moats and industry know-how that cannot be easily displaced by the "next-generation general model."

For developers and enterprise users:

  • Don't bet on a single model: Adopt multi-model architectures and model-routing strategies so you can switch at any time.
  • Focus on total cost of ownership (TCO), not unit price: Model pricing is just the tip of the iceberg; engineering integration, data preparation, and operational monitoring costs are equally critical.
  • Embrace open source: Where data security and performance allow, prioritize open-source solutions to reduce dependence on any single commercial model.
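
The TCO point above can be illustrated with a minimal sketch. Every number here is invented for the example, and the cost categories simply mirror the ones named in the list (usage fees, engineering integration, data preparation, operational monitoring). Real projects will have different categories and figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deployment:
    name: str
    price_per_million: float  # yuan per million tokens
    monthly_tokens: int       # expected token volume per month
    integration_cost: float   # one-off engineering integration, yuan
    data_prep_cost: float     # one-off data preparation, yuan
    monthly_ops_cost: float   # monitoring and operations, yuan per month


def total_cost_of_ownership(d: Deployment, months: int) -> float:
    """Usage fees plus one-off and recurring costs over `months` months."""
    usage = d.price_per_million * d.monthly_tokens / 1_000_000 * months
    one_off = d.integration_cost + d.data_prep_cost
    recurring = d.monthly_ops_cost * months
    return usage + one_off + recurring


# Hypothetical options: the cheaper model needs more integration work.
cheap = Deployment("low-unit-price", 0.8, 200_000_000, 80_000, 40_000, 6_000)
pricey = Deployment("high-unit-price", 8.0, 200_000_000, 20_000, 10_000, 3_000)

for option in (cheap, pricey):
    tco = total_cost_of_ownership(option, months=12)
    print(f"{option.name}: {tco:,.0f} yuan over 12 months")
```

With these made-up inputs, the model that is ten times cheaper per token ends up with more than double the twelve-month cost (193,920 yuan versus 85,200 yuan), because integration and operations dominate the bill.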

Outlook: When the 'Shelf Life' Approaches Zero

The rapid depreciation of large models will not reverse in the short term. As training costs continue to fall and open-source capabilities keep improving, models themselves are transitioning from "core assets" to "infrastructure" — and even to "consumables."

What holds lasting value is no longer any particular set of model weights, but the data flywheels, application scenarios, user habits, and industry solutions built around models. Just as no one in the cloud computing era cares which CPU runs under the hood, in the AI application era users will ultimately stop caring which model is running behind the scenes.

The shelf life of large models is getting shorter, but the shelf life of AI-generated value is just beginning. The key question is: are you selling "milk," or are you running the "dairy farm"?