Alibaba reveals progress with large language model research as Chinese Big Tech firms continue to push for ChatGPT rival


By Ann Cao

Alibaba Group Holding’s in-house research unit is making progress with its own large language models (LLMs), as Chinese Big Tech companies continue to pile into the artificial intelligence (AI) space in an attempt to come up with a rival to OpenAI’s ChatGPT.

A group of researchers from DAMO Academy unveiled a new audiovisual language model called Video-LLaMA, which is designed to understand both the visual and auditory content of videos, in a research paper published last week on arXiv, an online scientific paper repository.

The researchers have also open-sourced the code on the online developer community GitHub. Alibaba owns the South China Morning Post.

LLMs, which are trained through machine learning, underpin AI-powered chatbots like ChatGPT. They allow the chatbots to answer sophisticated queries and to generate detailed writing, code or other content.

The new DAMO Academy model improves on previous vision-LLMs because it tackles two challenges in video understanding: capturing the temporal changes in visual scenes and integrating audiovisual signals, according to the three researchers, Zhang Hang, Li Xin and Bing Lidong.

In a case demonstrated by the researchers, when given a video of a man playing saxophone on stage, the model was able to describe in text both the background sound of applause and visual content of the video. By comparison, previous models, such as MiniGPT-4 and LLaVA, mainly focus on static image comprehension, the researchers said.

Meanwhile, the researchers noted that the model is still “an early-stage prototype” with a few limitations, such as its limited ability to handle long videos including films and TV shows.

The move comes as a part of broader efforts by Alibaba, which is in the midst of its largest-ever corporate restructuring, to double down on its investment in the development and application of LLMs.

Alibaba’s cloud unit in April unveiled its own alternative to ChatGPT – Tongyi Qianwen – which is based on DAMO’s LLMs, making Alibaba one of the earliest Chinese companies to join the ChatGPT bandwagon, along with search engine giant Baidu, which launched its Ernie Bot in March. The service has received more than 200,000 beta testing applications from corporate clients, Alibaba chairman and CEO Daniel Zhang Yong said in a conference call with analysts last month.

DAMO first introduced its LLM called AliceMind last September, when deputy head Zhou Jingren unveiled it at the World AI Conference in Shanghai. He described it as a multimodal pre-trained language model that is able to process different types of inputs including text, images, audio, and video.

Alibaba has started to work with partners to develop industry-specific AI models, Zhang said. For instance, it is planning to launch cloud products and enterprise solutions based on its AI model, and integrate AI capabilities into various products, including its workplace collaboration tool DingTalk. – South China Morning Post
