China Token Exports: Between Statistical Illusions and "Price Butchers"
The “Misalignment” Behind the Data
In February 2026, a weekly report from OpenRouter caused a stir in the Chinese AI community: among the platform’s top ten models, Chinese models consumed 61% of the tokens, and the top three (MiniMax M2.5, Kimi K2.5, and Zhipu GLM-5) were all from China.
Against the backdrop of China’s deep anxiety over a “chip ban” and a “computing power shortage,” this report card seems extremely surreal. Given the chip shortage, why are Chinese tokens still being “dumped” overseas? Some are even starting to wonder: could this lead to a computing power surplus in the United States?
We need to understand what this 61% actually means.
OpenRouter: Is it a “panoramic view” or a “laboratory”?
To understand this data, you must first understand what OpenRouter is.
What is it? OpenRouter is the world’s largest AI API aggregation platform, a “model supermarket.” It is popular with individual developers, indie hackers, and AI startups.
Who is using it:
First, there are users of Agent tools. For example, developers using Cursor, Cline, or OpenClaw need to frequently call different models for code writing and automation tasks, making the one-stop Key provided by OpenRouter a necessity.
Secondly, there are cost-sensitive projects, those small AI applications that are still looking for product-market fit (PMF).
The truth: Despite its significant presence in the developer community, OpenRouter actually accounts for only about 2% of global AI spending, according to Ramp.
The real traffic drivers—the Fortune 500 companies and large SaaS vendors (such as Salesforce and Microsoft) that consume over 90% of the world’s tokens—will never call models through OpenRouter. Instead, they directly connect to the official native APIs of OpenAI and Anthropic or use Azure/AWS hosting.
The conclusion is clear: China’s dominance is currently limited to the “innovation labs” of global AI applications, and has not yet penetrated enterprise-level “central data centers”.
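To put the two figures side by side, here is a rough back-of-the-envelope sketch. It multiplies the 61% token share by OpenRouter's 2% spending share. This mixes a token measure with a spending measure, and it treats the top-ten share as if it held across the whole platform. Both are simplifying assumptions, so the result is only an order-of-magnitude ceiling, not a measurement.

```python
# Rough ceiling on how much of global AI activity the 61% figure could represent.
# Assumption: token share and spending share can be multiplied as if they were
# the same kind of quantity. They are not, so treat the result as a rough bound.

TOP10_CHINESE_TOKEN_SHARE = 0.61  # from the OpenRouter weekly report cited above
OPENROUTER_SPEND_SHARE = 0.02     # OpenRouter's share of global AI spending, per Ramp


def global_share_upper_bound(platform_share: float, share_on_platform: float) -> float:
    """Share of the global total if the on-platform share held for all platform usage."""
    return platform_share * share_on_platform


bound = global_share_upper_bound(OPENROUTER_SPEND_SHARE, TOP10_CHINESE_TOKEN_SHARE)
print(f"Implied ceiling on global share: {bound:.1%}")
```

Under these assumptions the ceiling comes out at a little over 1% of the global total, which is the scale at which the 61% headline should be read.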
The “overseas advantage” of Chinese tokens: price, price, and more price.
Why can Chinese models instantly dominate the charts in a market like OpenRouter where consumers “vote with their feet”?
Ultimate cost-effectiveness: While the input price of Claude Opus 4.6 was still hovering around […], the Chinese models’ price was about 0.30 on the same basis. That’s a price difference of nearly 17 times.
“Fuel Fee” in the Agent Era: 2026 marked the beginning of a boom for AI agents. Running an agent like OpenClaw can mean processing millions of tokens in a single task. With an American model, a day of testing could cost tens of dollars; with a Chinese model, it would cost only a few dollars.
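The "tens of dollars versus a few dollars" comparison can be sketched with the two numbers the text does give: a price of 0.30 and a gap of nearly 17 times. The sketch below assumes the 0.30 is in US dollars per million input tokens and picks a hypothetical volume of 10 million tokens per day. It also ignores output tokens and caching discounts, so it is an illustration rather than a billing model.

```python
# Daily "fuel fee" for an agent workload under stated assumptions.

LOW_PRICE_PER_M = 0.30  # from the text; assumed to be USD per million input tokens
PRICE_GAP = 17          # "nearly 17 times", from the text
TOKENS_PER_DAY = 10_000_000  # hypothetical test workload, not from the article


def daily_cost(tokens_per_day: int, price_per_million: float) -> float:
    """Cost of one day's input tokens at a flat per-million-token price."""
    return tokens_per_day / 1_000_000 * price_per_million


low = daily_cost(TOKENS_PER_DAY, LOW_PRICE_PER_M)
# The higher price is derived from the stated gap, not taken from a price list.
high = daily_cost(TOKENS_PER_DAY, LOW_PRICE_PER_M * PRICE_GAP)

print(f"cheaper model: ${low:.2f}/day, pricier model: ${high:.2f}/day")
```

At this hypothetical volume the cheaper model costs about 3 dollars a day and the pricier one about 51 dollars, which matches the rough orders of magnitude described above.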
Why are Chinese tokens so cheap?
Extreme optimization of model architecture: Optimizations by companies like DeepSeek and MiniMax in MoE (Mixture of Experts) architectures and sparse attention mechanisms have significantly reduced inference costs.
The “volume-for-price” overseas expansion strategy: With limited computing power, Chinese vendors must raise the output efficiency of each individual chip and spread sunk costs across massive overseas token orders.
The spillover of involution in the domestic market: Price wars have become the norm in the domestic API market. The ability to survive on extremely thin profit margins has turned into a “disruptive advantage” in overseas markets.
The overlooked killer app: Cross-border arbitrage of electricity costs
But if you only see the price war, you’ve only seen half the story. The real killer feature of Chinese tokens lies in a more fundamental logic: the cross-border delivery of electricity value.
This concept sounds a bit abstract, so let’s break it down.
What happens when an American developer calls the MiniMax API in San Francisco? Data travels from California, via a Pacific undersea fiber optic cable, to a data center in China. The GPU performs inference calculations there, and the results are then transmitted back to the United States.
Throughout the entire process, the electricity never left China’s power grid, but the value of the electricity was delivered across borders through tokens.
This is not a metaphor; it is real business logic.
Breaking down the costs of tokens, the two most critical components are computing power and electricity. Computing power is the depreciation and amortization of GPUs, while electricity is the operating cost of data centers. Electricity costs can account for up to 40% of data center operating costs.
This creates a huge cost scissors gap:
China’s industrial electricity price: Depending on the region and policies, the price is approximately 0.4-0.6 yuan/kWh (about […]). In the United States, by contrast, it is about $0.15/kWh or even higher, which is more than 50% higher than in China.
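The size of this gap can be checked with a short sketch. It assumes an exchange rate of 7.2 yuan per US dollar (not a figure from the article), reads the 0.15/kWh figure as a US price in dollars, and takes the "up to 40%" electricity share of operating costs at its upper bound. All three are assumptions made for illustration.

```python
# Electricity price gap and its effect on data center operating costs.

YUAN_PER_USD = 7.2                # assumed exchange rate, not from the article
CN_PRICE_YUAN = (0.4, 0.6)        # yuan/kWh range, from the text
US_PRICE_USD = 0.15               # per kWh; read as a US price in dollars (assumption)
ELECTRICITY_SHARE_OF_OPEX = 0.40  # "up to 40%", from the text (upper bound)


def yuan_to_usd(yuan: float) -> float:
    """Convert a yuan amount to US dollars at the assumed rate."""
    return yuan / YUAN_PER_USD


cn_low, cn_high = (yuan_to_usd(p) for p in CN_PRICE_YUAN)

# How much higher the US price is, even against China's most expensive regions.
premium_vs_cn_high = US_PRICE_USD / cn_high - 1

# Fraction of a US-priced operating budget saved by paying the Chinese rate,
# if electricity really were 40% of operating costs.
opex_saving = ELECTRICITY_SHARE_OF_OPEX * (1 - cn_high / US_PRICE_USD)

print(f"China: ${cn_low:.3f}-${cn_high:.3f}/kWh, US: ${US_PRICE_USD:.2f}/kWh")
print(f"US premium over China's upper bound: {premium_vs_cn_high:.0%}")
print(f"Operating cost saved at China's upper bound: {opex_saving:.0%}")
```

Even against the top of the Chinese range, the assumed US price is well over 50% higher, consistent with the claim above. The operating cost effect is smaller than the price gap itself because electricity is only one part of the bill.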
More importantly, China’s power infrastructure is extremely well-developed. Data centers wanting to connect to the grid? While the approval process is complex, as long as they meet the regulations, the power supply is stable and sufficient. In contrast, in the United States, many data centers are facing difficulties in obtaining power, and even Trump has publicly stated his intention to require large technology companies to build their own power plants.
This leads to an interesting phenomenon: while Chinese AI companies face limitations in chip development, they possess a structural advantage in the more fundamental resource of electricity.
This explains why the “computing power shortage” and “token going global” can coexist—Chinese manufacturers are using limited computing power, combined with extreme advantages in electricity costs, to squeeze the output efficiency of every GPU. Behind every token is the Chinese power grid “powering” developers worldwide.
From this perspective, the overseas expansion of tokens is not merely an increase in API call volume; it is essentially the global export of China’s electricity value. This is an invisible “new trio”.
Their strength should not be underestimated: From “cheap alternative” to “top-tier combat weapon”.
If it’s just about being cheap, developers won’t buy it. The surge in Chinese tokens also sends a strong signal: domestically developed large-scale models are now capable of handling real-world, complex business scenarios.
More than just casual conversation: According to OpenRouter’s statistics, Chinese models saw the fastest growth in both the Programming and Tool Use categories. This indicates that for productivity tasks such as real code and automated scripts, models like M2.5 and GLM-5 have crossed the threshold of being “usable and easy to use”.
The benefits of rapid iteration: Compared with the six-month update cycle of the Silicon Valley giants, Chinese models ship a major version almost every two months. This rapid response to developer feedback allows them to thrive in a highly flexible market like OpenRouter.
A developer who works on AI agents shared his experience: “We used to test with GPT-4, which cost tens of dollars a day, and it hurt our wallets. Now, with Kimi, the cost has dropped to one-tenth, and the iteration speed is actually faster.”
This is the true position of the Chinese model in the developer ecosystem: not a “good enough” backup plan, but the first choice that is “easy to use and cheap”.
A Sober Victory
We should not mythologize the 61% figure, because it has not changed the global AI computing power landscape; but we should not underestimate it either.
This data proves that at the foundational level of global AI applications, Chinese models are establishing their own niche through “cost-effectiveness + electricity cost advantage + rapid iteration.” While the main battleground of large enterprises remains in the hands of American giants, Chinese tokens are becoming the most affordable and abundant “fuel” of the AI era, powering millions of developers worldwide.
This is not a replacement driven by “excess” capacity; it is a silent flanking attack fought over efficiency and room to survive.
OpenRouter’s 2% market share may seem insignificant today, but who can guarantee that these independent developers who are used to MiniMax won’t grow into the CTOs of the next unicorn five years from now? Who can guarantee that the business model that works in the “laboratory” today won’t invade the “central data center” tomorrow?
History tells us that flanking attacks are often more effective than head-on confrontations. Huawei also started from “peripheral markets” like Africa and Southeast Asia, eventually becoming a global telecommunications equipment giant.
The story of AI in China may have only just begun.







