Adobe’s ‘ethical’ Firefly AI was trained on Midjourney images

By Rachel Metz and Brody Ford

AI
Monday, 15 Apr 2024
3:30 PM MYT

An Adobe employee walking the crowd through Adobe Firefly Custom Models and Adobe Firefly Services during the opening keynote at Adobe Summit on March 26, 2024 in Las Vegas. Adobe never made clear publicly that Firefly had trained in part on images from competitors’ tools that are supposedly less ethical. — AP Images for Adobe

When Adobe Inc released its Firefly image-generating software last year, the company said the artificial intelligence model was trained mainly on Adobe Stock, its database of hundreds of millions of licensed images. Firefly, Adobe said, was a “commercially safe” alternative to competitors like Midjourney, which learned by scraping pictures from across the Internet.

But behind the scenes, Adobe also was relying in part on AI-generated content to train Firefly, including from those same AI rivals. In numerous presentations and public posts about how Firefly is safer than the competition due to its training data, Adobe never made clear that its model actually used images from some of these same competitors.

Massive amounts of data are needed to train AI models underlying popular content creation products, and there is increasing scrutiny on AI technology companies over their use of copyrighted materials in this process. Companies like Midjourney, Dall-E maker OpenAI and Stable Diffusion maker Stability AI built their media-generating models with datasets that pull imagery from across the Internet, a practice that has led to outrage and lawsuits from a number of artists.

“This shows the murkiness of the definition of responsible AI, and it also illustrates the difficulties of getting away from, if not the legal, then the social and cultural problems, or ethical problems, with generated content,” said Luke Stark, an assistant professor at Western University in Ontario, who studies the social and ethical impacts of AI.

Adobe’s decision to build Firefly with content that the company holds the rights to or that is in the public domain was meant to differentiate its AI image tool in the fast-growing market for generative artificial intelligence. The company promoted it as a more ethical, legally sound option for customers interested in conjuring images from just a few words but wary of potential copyright issues. It won’t generate content based on the intellectual property of other people or brands, Adobe has said, and will avoid producing harmful images, too.

AI-generated content made it into Firefly’s training set because creators were allowed to submit millions of images into Adobe’s stock marketplace that used the technology from other companies. “Generative AI images from the Adobe Stock collection are a small part of the Firefly training dataset,” wrote Adobe representative Michelle Haarhoff in September on a Discord group for photographers and artists who contribute to the marketplace.

Adobe said a relatively small amount – about 5% – of the images used to train its AI tool was generated by other AI platforms. “Every image submitted to Adobe Stock, including a very small subset of images generated with AI, goes through a rigorous moderation process to ensure it does not include IP, trademarks, recognisable characters or logos, or reference artists’ names,” a company spokesperson said.

Criticism of the practice has come from inside the company: Since the early days of Firefly, there has been internal disagreement on the ethics and optics of ingesting AI-generated imagery into the model, according to multiple employees familiar with its development who asked not to be named because the discussions were private. Some have suggested weaning the system off generated images over time, but one of the people said there are no current plans to do so.

Adobe has taken shots at competitors over their data collection practices. Other models are built on data that is “openly scraped”, Chief Strategy Officer Scott Belsky said last year. One way that Firefly is better than OpenAI’s comparable model is that it shows respect for the creative community by training only on licensed or freely available data, Adobe says on its website. And in a blog post last March titled “Responsible Innovation in the Age of Generative AI,” general counsel Dana Rao pointed out that generative AI “is only as good as the data on which it’s trained.”

“Training on curated, diverse datasets inherently gives your model a competitive edge when it comes to producing commercially safe and ethical results,” he wrote, while pointing out that Adobe trained Firefly on Adobe stock images, licensed content and public domain content in which the copyright has run out.

“Our enterprise customers came to us when we launched Firefly and said, ‘We love what you’re doing, we really appreciate that you’re not stealing all of our intellectual property out on the open Internet’,” Ashley Still, an Adobe senior vice president, said earlier this month during a Bloomberg Intelligence event.

Still, Adobe never made clear publicly that Firefly had trained in part on images from competitors’ tools that are supposedly less ethical. It did, however, outline such details in at least two online discussion groups the company runs on Discord – one for Adobe Stock and another devoted to Firefly – according to messages Bloomberg has viewed.

In March 2023, Adobe unveiled Firefly as a “beta” product. That month, Raúl Cerón, who works with the Adobe Stock community, posted on Discord that the company wasn’t planning to use generated images to train the forthcoming public version of Firefly.

“Once we go live out of beta, we will have a new training database for it, leaving Gen AI content out of it,” he wrote in a post in June.

When Adobe announced the public release of Firefly on Sept 13, the company also paid a special “Firefly bonus” to Adobe Stock contributors “whose content was used to train the first commercial Firefly model”. Contributors who used generative AI were among those who received the bonus payment, according to a Discord message from Mat Hayward, who also works with the Adobe Stock community.

AI-generated imagery in Adobe Stock “enhances our dataset training model, and we decided to include this content for the commercially released version of Firefly,” Hayward wrote.

Brian Penny, a writer and stock image contributor who has submitted thousands of AI-generated images – mostly made with Midjourney – to Adobe Stock, was surprised to get the bonus. He figured as an AI contributor he wouldn’t be eligible. Despite the financial gain, Penny thinks the decision to train Firefly on content such as his is a bad one, and said the company should be more candid about how it’s training the software for creating images.

“They need to be ethical, they need to be more transparent, they need to do more,” he said.

Adobe Stock’s library has boomed since it began formally accepting AI content in late 2022. Today, about 57 million images, or about 14% of the total, are tagged as AI-generated. Artists who submit AI images must specify that the work was created using the technology, though they don’t need to say which tool they used. To feed its AI training set, Adobe has also offered to pay contributors to submit large numbers of photos for AI training – such as images of bananas or flags.

Training on AI-generated content probably wouldn’t make Adobe’s Firefly image generator less commercially safe, and the company isn’t required to say what it’s training on as long as it isn’t misleading consumers, said Harvard professor Rebecca Tushnet, who focuses on copyright and advertising law. But training on AI images, such as those created by Midjourney, undermines the idea that Firefly is unique from competing services, she said.

“Adobe basically wants to position itself as the superior alternative, but it also wants really cheap inputs, and AI is a really good way to get cheap inputs,” she said. – Bloomberg
