OpenAI unveils technology that can ‘reason’ with images


An illustrative image showing OpenAI’s ChatGPT in San Francisco, March 21, 2025. – KELSEY MCCLELLAN/The New York Times

SAN FRANCISCO: In September 2024, OpenAI introduced artificial intelligence technology that could “reason” through tasks involving math, coding and science.

Now, this technology can tackle similar tasks that involve images, including sketches, posters, diagrams and graphs.

On April 16, the company unveiled two new versions of its reasoning technology, called OpenAI o3 and OpenAI o4-mini. Each can handle tasks that involve both images and text.

These systems can “manipulate, crop and transform images in service of the task you want to do”, said Mark Chen, head of research at OpenAI, in announcing the new systems during an internet livestream.

OpenAI also said these systems could generate images, search the web and use other digital tools.

Unlike early versions of its ChatGPT chatbot, these reasoning systems spend a significant amount of time “thinking” about a question before answering, rather than providing an instant response.

The systems are part of a wider effort to build AI that can reason through complex tasks. Companies like Google, Meta and DeepSeek, a Chinese startup, are developing similar technologies.

The goal is to build systems that can solve a problem through a series of steps, each one building on the last, similar to how humans reason. These technologies can be particularly useful to computer programmers who use AI systems to write code.

The reasoning systems are based on a technology called large language models, or LLMs. To build reasoning systems, companies put LLMs through an additional process called reinforcement learning. During this process, a system learns behavior through extensive trial and error.

By working through various math problems, for instance, it can learn which methods lead to the right answer and which do not. If it repeats this process with a large number of problems, it can identify patterns.
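The trial-and-error idea can be illustrated with a toy sketch. This is not OpenAI's training method, which the article does not detail; it is a simple "epsilon-greedy" learner, and the function name, the three made-up success rates and the other parameters are all assumptions chosen for illustration. The learner repeatedly tries a method, sees whether the answer was right, and gradually favors whichever method has worked best so far.

```python
import random


def learn_by_trial_and_error(success_rates, trials=2000, explore=0.1, seed=0):
    """Toy epsilon-greedy learner (illustrative only, not OpenAI's method).

    success_rates: hypothetical chance that each "method" yields a right answer.
    Returns the learner's estimated success rate per method and how often
    each method was tried.
    """
    rng = random.Random(seed)
    counts = [0] * len(success_rates)
    estimates = [0.0] * len(success_rates)

    for _ in range(trials):
        if rng.random() < explore:
            # Occasionally try a method at random (exploration).
            choice = rng.randrange(len(success_rates))
        else:
            # Otherwise use the method that has looked best so far.
            choice = max(range(len(estimates)), key=lambda i: estimates[i])

        # Reward is 1 for a right answer, 0 for a wrong one.
        reward = 1.0 if rng.random() < success_rates[choice] else 0.0

        # Update the running average of rewards for the chosen method.
        counts[choice] += 1
        estimates[choice] += (reward - estimates[choice]) / counts[choice]

    return estimates, counts


if __name__ == "__main__":
    # Three hypothetical methods; the second is right most often.
    estimates, counts = learn_by_trial_and_error([0.1, 0.9, 0.3])
    best = max(range(len(estimates)), key=lambda i: estimates[i])
    print("estimated success rates:", [round(e, 2) for e in estimates])
    print("times each method was tried:", counts)
    print("preferred method index:", best)
```

Over many trials, the learner ends up relying mostly on the method that most often produces a right answer, which is the pattern-finding behavior described above in miniature.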

OpenAI’s latest systems have learned to handle problems that involve both images and text.

Experts point out that reasoning systems do not necessarily reason like a human. And like other AI technologies, they can get things wrong and make stuff up – a phenomenon called hallucination.

OpenAI also unveiled a new tool, Codex CLI, that is designed to further facilitate computer programming tasks that involve systems like o3 and o4-mini. Called an AI agent, it provides ways of using these AI systems in tandem with existing code stored on a programmer’s personal machine.

The company said it was open sourcing this tool, meaning it is freely sharing its underlying technology with programmers and businesses, allowing them to modify and build on the technology.

OpenAI said that, beginning Wednesday, these new systems would be available to anyone who subscribed to ChatGPT Plus, a US$20 (RM88)-a-month service, or ChatGPT Pro, a US$200 (RM882)-a-month service that provides access to all of the company’s latest tools.

(The New York Times has sued OpenAI and its partner, Microsoft, for copyright infringement of news content related to AI systems. Both companies have denied the claims.) – ©2025 The New York Times Company

This article originally appeared in The New York Times.
