DeepSeek stays mum on next AI model release as technical papers show frontier innovation


Hangzhou-based start-up's technical papers bode well for potential advances in its next major AI models. — SCMP

Chinese artificial intelligence firm DeepSeek continues to keep the world guessing about when it will launch its next major release – the much-anticipated updates to its V3 and R1 models – according to analysts, amid its recent publication of technical papers.

The papers underscored DeepSeek’s efforts to improve the underlying infrastructure of AI systems in China at a time when geopolitical tensions and domestic production hurdles restricted the country’s access to advanced semiconductors to train new models, according to Zhang Ruiwang, a Beijing-based information systems architect working in the internet sector.

“DeepSeek just wants to prove that AI infrastructure innovation would drive efficiency and further scale up performance of models,” Zhang said.

Still, speculation swirled across the global AI community about potential delays in the Hangzhou-based start-up’s launch of its next-generation V4 and R2 models, which would succeed V3, introduced in December 2024, and R1, which was released in January last year.

DeepSeek declined to comment on reports that V4 would be released during the Lunar New Year.

Apart from US restrictions on China’s access to cutting-edge graphics processing units (GPUs) and advanced chipmaking equipment, the world’s AI developers currently face a shortage of memory chips amid high demand for these semiconductors in enterprise data centres.

People are seen learning about DeepSeek’s technologies at an artificial intelligence-themed fair in Hangzhou, the capital of eastern Zhejiang province, on May 4, 2025. Photo: Xinhua

DeepSeek on Tuesday published a new technical paper that proposed a novel technique to train models, which researchers said could facilitate “aggressive parameter expansion” by bypassing GPU memory constraints.

This paper introduced a “conditional memory” technique, called Engram, to address a key bottleneck of scaling up AI models: the limited capacity of GPU high-bandwidth memory.

That followed DeepSeek’s release of an updated technical paper about R1, which featured the contributions of its 18 core scientists.

The paper suggested that DeepSeek retained all 18 scientists behind its AI model development efforts, as well as many of the R1 project’s 176 contributors, despite fierce competition for talent in China’s AI industry.

On the last day of 2025, DeepSeek published a technical paper, with founder and CEO Liang Wenfeng among the 19 co-authors, about “manifold-constrained hyper-connections” (mHC) – a general framework for training AI systems at scale, which suggested “promising directions for the evolution of foundational models”.

The mHC method forms part of DeepSeek’s push to make its AI models more cost-effective, as it strives to keep pace with better-funded US rivals with greater access to advanced computing power.

According to Zhang, these DeepSeek innovations could be a boon for Chinese developers with limited computing and capital resources, as the company focused on going back to the basics and improving these processes.

The firm’s website listed eight foundational models, including new V3 versions – V3.2 and V3.2-Speciale – that were launched last month.

Zhang said that even if DeepSeek released a next-generation model, either V4 or R2, around the Lunar New Year, the company was not expected to stun the world as it did last year.

“What would be more important was that a Chinese company delivered new breakthroughs to bring AI to the hands of everyone,” Zhang said.

When British journal Nature last month profiled DeepSeek’s Liang as part of its top 10 “people who shaped science in 2025”, it recognised the disruption caused by R1’s release in January last year, as the Chinese start-up showed that “the United States was not as far ahead in AI as many experts had thought”. – South China Morning Post
