Alibaba Launches Surprise Late-Night Price Cut: Large Model Price War Enters a New Phase

📅 2026-05-01 · 📁 Industry

💡 Alibaba Cloud's Tongyi large model announced a dramatic price reduction late at night, slashing cached token pricing to 1 RMB per million tokens — signaling that the competitive logic in the large model industry is shifting entirely from 'competing on performance' to 'competing on cost.'

Late-Night Blitz: Alibaba Reignites the Price War

As competition in the AI large model arena reaches a fever pitch, Alibaba Cloud has once again wielded its "price-slashing blade." Late one recent night, Alibaba Cloud quietly updated the pricing page for its Tongyi large model API, dramatically cutting cached token prices to just 1 RMB per million tokens. This price point has effectively shattered the industry's previous cost floor, once again disrupting an already turbulent large model market.

The decision to release pricing updates late at night reflects both Alibaba's trademark "play it low-key" strategy and the breakneck pace of competition in the large model industry — no one wants to give rivals time to react. The news quickly sparked widespread discussion across developer communities and industry circles.

What Does 1 RMB Per Million Tokens Really Mean?

To appreciate the impact of this price cut, it's important to first understand the concept of "cached tokens." In large model API calls, a caching mechanism lets the system reuse the already-processed context of frequently repeated requests, typically a shared prompt prefix, which sharply reduces redundant computation. Input tokens served from the cache are billed at the lower cached rate. For enterprise clients, many business scenarios, such as customer service conversations, document Q&A, and code completion, involve highly similar prompt prefixes, with cache hit rates often exceeding 60%.

Cutting cached token prices to 1 RMB per million tokens means that enterprises' actual costs in high-frequency usage scenarios will be significantly compressed. Consider a mid-sized customer service system handling 100,000 dialogue requests per day: assuming an average of 2,000 tokens per request and a 70% cache hit rate, that is 200 million tokens per day, of which 140 million are served from cache. At 1 RMB per million tokens, the daily cost for cached tokens alone comes to about 140 RMB. Just a year ago, such a figure would have been almost unimaginable.
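The arithmetic behind this example can be checked with a short sketch. It assumes that all 2,000 tokens per request are input tokens eligible for caching and that the hit rate applies per token. The function name and parameters are illustrative and not part of any Alibaba Cloud API. Uncached tokens and output tokens are billed separately at rates the article does not give, so they are left out.

```python
def daily_cached_token_cost(
    requests_per_day: int,
    tokens_per_request: int,
    cache_hit_rate: float,
    price_per_million_rmb: float,
) -> float:
    """Estimate the daily cost (RMB) of the cached portion of token traffic.

    Illustrative helper: assumes every token in a request is cacheable
    input and that cache_hit_rate is the fraction of tokens served from cache.
    """
    total_tokens = requests_per_day * tokens_per_request
    cached_tokens = total_tokens * cache_hit_rate
    return cached_tokens / 1_000_000 * price_per_million_rmb


# Figures from the article's customer service example.
cost = daily_cached_token_cost(
    requests_per_day=100_000,
    tokens_per_request=2_000,
    cache_hit_rate=0.70,
    price_per_million_rmb=1.0,
)
print(f"Daily cached-token cost: {cost:.2f} RMB")
```

Under these assumptions the cached portion is 140 million tokens per day, or roughly 140 RMB. The remaining 60 million uncached tokens would add to the bill at whatever the standard input price is.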

A Fundamental Shift in Competitive Logic

Looking back at the large model race over the past year or so, industry competition has gone through three distinct phases:

Phase One: Competing on parameters and benchmarks. Vendors raced to release models with hundreds of billions of parameters, jockeying for position on various benchmarks, with technical specifications as the core selling point.

Phase Two: Competing on ecosystems and real-world deployment. As model capabilities began to converge, the competitive focus shifted to who could embed large models into real business scenarios fastest, with developer ecosystems and application case studies becoming key differentiators.

Phase Three: Competing on cost and efficiency. This is the phase the industry is currently experiencing. As technical gaps narrow and application scenarios become clearer, "affordability" has become the core bottleneck determining whether large models can truly achieve mass adoption. Alibaba's late-night price adjustment is a landmark event marking this shift in logic.

This evolution in competitive logic is not unique to the Chinese market. Overseas, giants like OpenAI, Google, and Anthropic are also continuously optimizing their pricing strategies and launching more cost-effective model versions. However, the intensity of the price war in the Chinese market is clearly higher — from last year's "Battle of a Hundred Models" to today's "pricing bloodbath," the brutality of this elimination round has far exceeded expectations.

Who Benefits? Who Feels the Pressure?

The beneficiaries are obvious. Small and medium-sized enterprises and independent developers stand to be the biggest winners. Startup teams that previously hesitated due to API call costs can now validate products and scale deployments at extremely low prices. The barrier to entry for AI-native application development has been further lowered, potentially catalyzing a new wave of application-layer innovation.

The pressure falls on smaller model vendors. For players lacking cloud computing infrastructure support and unable to spread costs through economies of scale, a price war of this magnitude essentially means "you can't keep up even if you're willing to lose money." Giants like Alibaba, ByteDance, and Baidu can cross-subsidize through their massive cloud business ecosystems, while independent large model companies face severe existential challenges.

Notably, lower prices do not mean lower quality. Alibaba's price adjustments are primarily focused on optimizations at the caching level — essentially leveraging engineering capabilities and infrastructure advantages to compress marginal costs, rather than sacrificing model performance. This also means that future competition will not only be about model R&D capabilities, but also about comprehensive strengths in systems engineering, inference optimization, and resource scheduling.

Outlook: Large Models Heading Toward the 'Utility' Era

Alibaba's late-night price cut sends a clear signal: large models are accelerating their transformation into fundamental infrastructure. Just as cloud computing evolved over the past decade from a "luxury" into a basic "utility" like water, electricity, and gas, large model API pricing is likely to keep declining, and the APIs themselves will eventually become a standard component of enterprise digital transformation.

In the short term, the industry price war is likely to continue, with ByteDance, Baidu, Tencent, and other players highly likely to follow suit with price adjustments in the near future. In the medium term, when price is no longer a differentiating factor, competition will return to deeper dimensions such as model quality, service reliability, and ecosystem maturity. In the long term, the ultimate winners will be those who can maintain the pace of technological iteration and business model sustainability while delivering rock-bottom prices.

For the AI industry as a whole, this is undoubtedly a positive signal — when the cost of usage is no longer a barrier, the possibilities for innovation are truly unleashed.

📌 Source: GogoAI News (www.gogoai.xin)

🔗 Original: https://www.gogoai.xin/article/alibaba-cloud-late-night-price-cut-large-model-price-war-new-phase

⚠️ Please credit GogoAI when republishing.
