DeepSeek technique to improve AI’s ability to ‘read’ long texts questioned by new research


AI models face a critical limitation known as the long-context bottleneck, which restricts their ability to process lengthy documents. — SCMP

A group of researchers from China and Japan has challenged a method unveiled several months ago by Chinese artificial intelligence start-up DeepSeek that was designed to improve AI’s ability to handle long blocks of text, marking a rare case of the company’s research being publicly questioned.

The DeepSeek-OCR (optical character recognition) method, which compresses text into visual representations and could potentially revolutionise how AI models handle long texts, was flawed due to inconsistent performance, according to researchers from Japan’s Tohoku University and the Chinese Academy of Sciences.

In their study, titled “Visual Merit or Linguistic Crutch? A Close Look at DeepSeek-OCR”, the research team found that the start-up’s method relied heavily on language priors – the tendency of AI models to draw on patterns learned from large volumes of text – rather than the visual understanding it claimed, making performance metrics reported by the Chinese company “misleading”.

AI models faced a critical limitation known as the long-context bottleneck, which restricted their ability to process lengthy documents or extended conversations, the researchers noted.

Improvements in this area, which would deliver a performance leap for AI systems, have been sought by companies and research institutes worldwide.

A DeepSeek display at an AI fair in Hangzhou, east China’s Zhejiang Province, May 4, 2025. Photo: Xinhua

The DeepSeek-OCR technique, published in October, was said to be able to handle large and complex documents by using visual perception as a compression medium.

“Vision-context compression can achieve significant token reduction – seven to 20 times ... offering a promising direction” for addressing long-context challenges in AI, the company said at the time.

However, in a series of carefully designed experiments, the new research found that DeepSeek-OCR’s visual question answering accuracy dropped to around 20% when it was given additional text designed to sway its reasoning, compared with over 90% accuracy for standard AI models.

The researchers said the gap “ultimately questions whether current optical-compression approaches represented a viable path towards solving [AI models’] long-context limitations” and suggested alternative strategies could be needed.

DeepSeek did not immediately respond to a request for comment on Monday.

Some computer scientists, however, described DeepSeek-OCR’s reliance on language priors as more of a double-edged sword than a fundamental flaw, as there was no silver bullet for all situations.

Li Bojie, a computer science PhD from the University of Science and Technology of China, who now runs his own AI start-up in Beijing, said that for barely discernible manuscripts, reliance on learned knowledge could help AI figure out the text, but for clearly printed material it could be a disadvantage.

“You could say [the method] has both its advantages and a disadvantage,” Li said. – South China Morning Post 
