A group of researchers from China and Japan has challenged a method unveiled several months ago by Chinese artificial intelligence start-up DeepSeek that was designed to improve AI’s ability to handle long blocks of text, marking a rare case of the company’s research being publicly questioned.
The DeepSeek-OCR (optical character recognition) method, which compresses text into visual representations and could potentially revolutionise how AI models handle long texts, was flawed due to inconsistent performance, according to researchers from Japan’s Tohoku University and the Chinese Academy of Sciences.
In their study, titled “Visual Merit or Linguistic Crutch? A Close Look at DeepSeek-OCR”, the research team found that the start-up’s method relied heavily on language priors – the tendency of AI models to draw on patterns learned from large volumes of text – rather than the visual understanding it claimed, making performance metrics reported by the Chinese company “misleading”.
AI models faced a critical limitation known as the long-context bottleneck, which restricted their ability to process lengthy documents or extended conversations, the researchers noted.
Companies and research institutes worldwide have sought improvements in this area, which would deliver a leap in AI system performance.

The DeepSeek-OCR technique, published in October, was said to be able to handle large and complex documents by using visual perception as a compression medium.
“Vision-context compression can achieve significant token reduction – seven to 20 times ... offering a promising direction” for addressing long-context challenges in AI, the company said at the time.
However, in a series of carefully designed experiments, the new research found that DeepSeek-OCR’s visual question-answering accuracy dropped to around 20% when it was given additional text designed to sway its reasoning, compared with over 90% accuracy for standard AI models.
The researchers said the gap “ultimately questions whether current optical-compression approaches represented a viable path towards solving [AI models’] long-context limitations” and suggested alternative strategies could be needed.
DeepSeek did not immediately respond to a request for comment on Monday.
Some computer scientists, however, described DeepSeek-OCR’s reliance on language priors as more of a double-edged sword than a fundamental flaw, as there was no silver bullet for all situations.
Li Bojie, a computer science PhD from the University of Science and Technology of China, who now runs his own AI start-up in Beijing, said that for barely discernible manuscripts, reliance on learned knowledge could help AI figure out the text, but for clearly printed material it could be a disadvantage.
“You could say [the method] has both its advantages and a disadvantage,” Li said. – South China Morning Post
