A government-funded artificial intelligence (AI) institute in Beijing unveiled on Monday the world’s most sophisticated natural language processing (NLP) model, surpassing those from Google and OpenAI, as China seeks to increase its technological competitiveness on the world stage.
The WuDao 2.0 model is a pre-trained AI model that uses 1.75 trillion parameters to simulate conversational speech, write poems, understand pictures and even generate recipes. The project was led by the non-profit research institute Beijing Academy of Artificial Intelligence (BAAI) and developed with more than 100 scientists from multiple organisations.
Parameters are the internal variables that a machine learning model learns from data. As the model is trained, its parameters are further refined so that the algorithm gets better at finding the correct outcome over time. Once a model is trained on a specific data set, such as samples of human speech, the outcome can then be applied to solving similar problems.
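To make the idea of refining parameters concrete, here is a minimal sketch in Python. It is a toy illustration and not how WuDao 2.0 was trained: the model has only two parameters (`w` and `b`), and the function names, sample data, learning rate and step count are all assumptions chosen for the example.

```python
def train_linear_model(xs, ys, steps=2000, learning_rate=0.05):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0  # the model's two parameters, starting from arbitrary values
    n = len(xs)
    for _ in range(steps):
        # How much the error changes when each parameter is nudged.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Refine the parameters a little in the direction that reduces the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def predict(w, b, x):
    """Apply the trained parameters to an input the model has not seen."""
    return w * x + b


# Hypothetical training data generated from the rule y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = train_linear_model(xs, ys)
print(f"learned parameters: w={w:.3f}, b={b:.3f}")
print(f"prediction for x=5: {predict(w, b, 5.0):.3f}")
```

The loop starts from arbitrary values and repeatedly adjusts the two parameters to shrink the error on the training samples. A large language model applies the same principle to billions or trillions of parameters.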
In general, the more parameters a model contains, the more sophisticated it is. However, creating a more complex model requires time, money, and research breakthroughs.
In an era of fast-evolving AI models, BAAI researchers claim to have broken the record set in January by Google’s Switch Transformer, which has 1.6 trillion parameters. OpenAI’s GPT-3 model made waves last year when it was released with 175 billion parameters, making it the largest NLP model at the time.
WuDao 2.0 covers both Chinese and English with skills acquired by studying 2.5 terabytes of images and texts, including 1.2 terabytes of Chinese texts. It already has 22 partners, including smartphone maker Xiaomi, on-demand delivery service provider Meituan and short-video giant Kuaishou.
“These sophisticated models, trained on gigantic data sets, only require a small amount of new data when used for a specific feature because they can transfer knowledge already learned into new tasks, just like human beings,” said Blake Yan, an AI researcher from Beijing.
“Large-scale pre-trained models are one of today’s best shortcuts to artificial general intelligence,” he added, using a term for the hypothetical ability of a machine to learn any task that a human can.
Such models act as strategic infrastructure for AI development, said Zhang Hongjiang, chairman of BAAI, on Monday while announcing the project. They are like power plants using data as their fuel, he added, generating intelligence to support AI applications.
China and the US are currently in a race to produce the next generation of sophisticated technologies. China has lagged behind in areas of strategic importance, but the government has been ploughing resources into new technologies that include AI, 5G and semiconductors to help close the gap with its rival, to varying degrees of success.
BAAI is funded by the Beijing government, which put 340mil yuan (RM219.53mil) into the academy in 2018 and 2019 alone, pledging to continue its support, a Beijing official said in a 2019 speech.
The US, sensing that it might be losing its edge, is fighting to stay ahead, with President Joe Biden asking Congress to back a US$13.5bil (RM55.65bil) increase in total federal spending on research and development. Earlier this month, a Senate panel also approved the Endless Frontier Act, pending legislation that would authorise more than US$110bil (RM453.50bil) for basic and advanced technological research over five years.
Last year, the previous administration under former US president Donald Trump unveiled plans to invest US$1bil (RM4.12bil) in AI and quantum science.
A March report by the US National Security Commission on Artificial Intelligence, which is chaired by former Google CEO Eric Schmidt and includes representatives from other major tech firms, identified China as a potential threat to American AI supremacy. The Rand Corporation think tank also warned last year that Beijing’s focus on AI has helped it substantially narrow its gap with the US, attributing the US’s “modest lead” to its dominant semiconductor sector. – South China Morning Post