The Story of Chinese LLMs

2025 started as a very interesting year in the Generative Artificial Intelligence (Gen AI) world.

On 20th January 2025, DeepSeek, an AI company based in China, launched its model – DeepSeek-R1. The company claimed that this model performs reasoning tasks on par with, or better than, o1, the most advanced reasoning model from OpenAI, the company behind ChatGPT.

The most astonishing aspect of the launch, however, was not the capabilities of the LLM. It was the pricing. DeepSeek offered its services for $0.14 per 1M input tokens and $0.27 per 1M output tokens. (A token is the smallest unit of text that an AI model processes. The question you ask is counted as input tokens and the answer is counted as output tokens.)

At the same time, OpenAI’s o1 model, which DeepSeek compared itself to, was priced at $15 per 1M input tokens and $60 per 1M output tokens. Even o1 mini was priced at $3 per 1M input tokens and $12 per 1M output tokens.

Older OpenAI models were priced lower, but on the day of the launch, none of the options matched DeepSeek’s price.
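To get a feel for the gap, here is a minimal sketch that applies the per-million-token prices quoted above to a single workload. The workload size (10 million input tokens and 2 million output tokens) and the function name cost_usd are assumptions made purely for illustration; they do not come from any provider.

```python
# Prices in US dollars per one million tokens, as quoted in this article.
PRICES = {
    "DeepSeek": {"input": 0.14, "output": 0.27},
    "o1": {"input": 15.00, "output": 60.00},
    "o1 mini": {"input": 3.00, "output": 12.00},
}


def cost_usd(model, input_tokens, output_tokens):
    """Return the cost in dollars of one workload at the quoted prices."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical workload (an assumption, not real usage data).
INPUT_TOKENS = 10_000_000
OUTPUT_TOKENS = 2_000_000

baseline = cost_usd("DeepSeek", INPUT_TOKENS, OUTPUT_TOKENS)
for model in PRICES:
    cost = cost_usd(model, INPUT_TOKENS, OUTPUT_TOKENS)
    # Show each model's cost and how many times DeepSeek's cost it is.
    print(f"{model:8s} ${cost:8.2f}  ({cost / baseline:.0f}x DeepSeek)")
```

With these assumed volumes, the same workload costs about $1.94 on DeepSeek, $54 on o1 mini, and $270 on o1. The exact ratio depends on the mix of input and output tokens, so treat the multiples as rough.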

The Stock Market Crash

DeepSeek claimed that it trained its models more efficiently. It declared a training cost that was a fraction of OpenAI’s known training costs.

This got people thinking – if high-end chips are NOT needed for developing AI engines, why are chip companies so highly valued?

NVIDIA, one of the world’s leading chip companies, saw its stock price fall by about 17% in a single day on 27th January. US technology stocks fell broadly along with it.

Out of the Frying Pan, into the Fire

In January 2025, some of the issues facing ChatGPT in particular and the Gen AI industry in general were:

A. Lawsuits related to copyright violations

B. Impending government regulations on AI

C. Global concerns over the use of personal data and the use of Gen AI to create deepfakes

D. OpenAI was still not able to turn a profit, so investors, including Microsoft, were finding it hard to justify the investment to their shareholders

For an industry already in the frying pan, the idea that a Chinese startup could make Gen AI faster, cheaper, and better was the fire.

Within two days, OpenAI claimed that DeepSeek had improperly used OpenAI’s data to train its LLM, in violation of OpenAI’s terms of use.

Within a week, many countries realised that since DeepSeek’s servers are based in China, queries and their results are stored there and could be accessed by the Chinese government.

It may not matter if your plan for an essay about a historical event reaches the Chinese government, but some students and other users might share important confidential information.

For example, a startup founder might type a query: “What is the best way to use funds if we raise 0.8 million dollars for our tidal energy startup?” Now, the Chinese government could learn that someone in this country is looking to generate tidal energy, and the kind of money they are looking for.

These security concerns led South Korea, Australia, Italy, and Taiwan to ban or restrict DeepSeek. The US Navy, the State of Texas, and the State of New York have also banned the solution for official use, while countries in the EU are monitoring the situation.

The Government of India has issued a strong advisory to its staff members not to use Gen AI tools (both DeepSeek and ChatGPT).

And then the floodgates opened

Within days, Chinese LLMs were being launched at an unprecedented pace.

On January 22nd, ByteDance (the company behind TikTok) announced an update to its model, Doubao-1.5-pro.

On the same day, Tencent Holdings launched Hunyuan3D 2.0, an open-source, high-quality 3D graphics generator built on its own Hunyuan models. Tencent is the owner of Riot Games and WeChat. It also holds a 40% stake in Epic Games and a minority stake in Snap, the company behind Snapchat.

On January 25th, another startup called Moonshot AI launched Kimi k1.5, a multimodal model (one that can process text, video, images, etc.) with English language support that is limited and still a work in progress.

On January 29th, Alibaba launched Qwen2.5-Max.

On February 5th, realistic AI videos were generated through OmniHuman, a ByteDance product. (AI-generated media is nothing new. It is suspected that Spotify already uses AI-generated music on its platform to save on artist royalties.)

On February 9th, DeepSeek ended its ‘promotional’ pricing and raised its prices. The new prices are still lower than those of all American companies, but are higher than before.

Can you think of a player that does not figure in this list? Baidu – the creator of the search engine that is the Google of China. While Baidu has major AI investment and some solutions that are already live, it has not released a new solution or a major update in this January-February volley of announcements.

How did China do this?

How were so many announcements possible within weeks? What was going on in China?

What do you need when you want to build a new capability? Three things – supportive policy, infrastructure, and skilled people. China worked on all three.

In 2017, the government of China announced that China would become the global leader in AI by 2030. Additionally, companies were tasked with creating solutions “such that technologies and applications achieve a world-leading level” by 2025.

In addition, the first draft of rules specifically for generative AI was published in April 2023, and the final measures took effect in August 2023.

In China, LLMs have to be approved by the government before they can be used by the public.

Infrastructure: Chips

In 2021–2022, 55 percent of global semiconductor patent applications were Chinese in origin (and China’s number of applications doubled America’s) while Chinese entities surpassed U.S. and Japanese ones for semiconductor patents granted in 2022.

According to a White Paper by the Semiconductor Industry Association, China focused on every aspect of chip making – financing, complexity of chips, scale, and relative global performance.

People

According to a report by Georgetown University, Washington, China had approved AI-based degrees. Nine defense-affiliated universities advertised 79 AI job openings across 10 provinces. As a graphic in the report indicates, most of the education and job postings were in Western China.

Outcome: Models for the future

The news may be sudden, but the ascent has been gradual.

Today, China has over 130 approved models being used by about 600 million people on the mainland.

In February 2024, the world was already taking note of the emerging Gen AI models of China.

Creating an ecosystem

Chinese companies embed each other’s solutions. For example, the day after DeepSeek was launched, Baidu indicated that it had integrated DeepSeek into its cloud solution.

Increasing adoption

Hugging Face, an online directory of AI models and datasets, has the following models in its “Trending” list at the moment:

But when it comes to most downloads, the historical advantage is evident:

What happens now?

As always, we don’t know. 🙂

Note: The images of stock market crash, frying pan to fire, floodgates, and Chinese plan for the future, are all generated using AI.