DeepSeek touts new training method as China pushes AI efficiency



DeepSeek published a paper outlining a more efficient approach to developing AI, illustrating the Chinese artificial intelligence industry’s effort to compete with the likes of OpenAI despite lacking unfettered access to Nvidia Corp chips.

The document, co-authored by founder Liang Wenfeng, introduces a framework it calls Manifold-Constrained Hyper-Connections. It’s designed to improve scalability while reducing the computational and energy demands of training advanced AI systems, according to the authors.

Such publications from DeepSeek have foreshadowed the release of major models in the past. The Hangzhou-based startup stunned the industry with the R1 reasoning model a year ago, developed at a fraction of the cost of its Silicon Valley rivals. DeepSeek has since released several smaller platforms but anticipation is mounting for its next flagship system, widely dubbed the R2, expected around the Spring Festival in February.

Chinese startups continue to operate under significant constraints, with the US preventing access to the most advanced semiconductors essential to developing and running AI. Those restrictions have forced researchers to pursue unconventional methods and architectures.

What Bloomberg Intelligence says

DeepSeek’s forthcoming R2 model – which could launch in the next few months – has the potential to upend the global AI sector again, despite Google’s recent gains. Google’s Gemini 3 model overtook OpenAI in November to claim a top-three slot in LiveBench’s ranking of global large language model (LLM) performance. China’s low-cost models, developed at a fraction of what their competitors spend, claimed two slots in the top 15.

– Robert Lea and Jasmine Lyu, analysts

DeepSeek, known for its unorthodox innovations, published its latest paper this week through the open repository arXiv and open-source platform Hugging Face. The paper lists 19 authors, with Liang’s name appearing last. 

The founder, who’s consistently steered DeepSeek’s research agenda, has pushed his team to rethink how large-scale AI systems are conceived and built.

The latest research addresses challenges such as training instability and limited scalability, noting that the new method incorporates “rigorous infrastructure optimisation to ensure efficiency.”

Tests were conducted on models ranging from 3 billion to 27 billion parameters, building on ByteDance Ltd’s 2024 research into hyper-connection architectures.
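The underlying idea builds on a simple change to the standard transformer recipe: instead of carrying a single residual stream between layers, a hyper-connection carries several parallel streams, with small sets of learned weights deciding how the streams feed each layer and how the layer’s output is distributed back. As a rough illustration only, the PyTorch sketch below renders the static hyper-connection idea from the ByteDance line of work; the class name, stream count and weight parameterisation here are our own assumptions, and the sketch does not model the manifold constraints DeepSeek’s paper adds on top.

```python
import torch
import torch.nn as nn

class HyperConnection(nn.Module):
    """Illustrative static hyper-connection (hypothetical rendering of the
    ByteDance 2024 idea): keep n parallel residual streams instead of one,
    with learnable weights governing how the streams feed the wrapped layer
    and how its output is written back."""

    def __init__(self, layer: nn.Module, n_streams: int = 4):
        super().__init__()
        self.layer = layer
        # depth weights: each stream's contribution to the layer input
        self.alpha = nn.Parameter(torch.full((n_streams,), 1.0 / n_streams))
        # output weights: each stream's share of the layer output
        self.beta = nn.Parameter(torch.ones(n_streams))
        # width weights: learnable mixing among streams (identity = plain copy)
        self.mix = nn.Parameter(torch.eye(n_streams))

    def forward(self, streams: torch.Tensor) -> torch.Tensor:
        # streams: (n_streams, batch, d_model)
        layer_in = torch.einsum("n,nbd->bd", self.alpha, streams)
        layer_out = self.layer(layer_in)                       # (batch, d_model)
        mixed = torch.einsum("nm,mbd->nbd", self.mix, streams)
        # each stream keeps a mix of the others plus a share of the new output
        return mixed + self.beta[:, None, None] * layer_out

# usage: wrap any block; initialise the streams by replicating the input
block = HyperConnection(nn.Linear(64, 64), n_streams=4)
x = torch.randn(8, 64)                      # (batch, d_model)
streams = x.unsqueeze(0).expand(4, -1, -1)  # replicate into 4 streams
out = block(streams)                        # (4, 8, 64)
```

In the paper’s framing, constraining how these extra streams interact is what keeps the added width from inflating training cost and instability; the sketch above makes no such guarantee.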

The technique holds promise “for the evolution of foundational models,” the authors said. – Bloomberg
