Anthropic wins ruling on AI training in copyright lawsuit but must face trial on pirated books


The Anthropic website and mobile phone app are shown in this photo, in New York, July 5, 2024. — AP Photo/Richard Drew, File

In a test case for the artificial intelligence industry, a federal judge has ruled that AI company Anthropic didn’t break the law by training its chatbot Claude on millions of copyrighted books.

But the company is still on the hook and must now go to trial over how it acquired those books by downloading them from online “shadow libraries” of pirated copies.

US District Judge William Alsup of San Francisco said in a ruling filed late Monday that the AI system's distilling from thousands of written works to be able to produce its own passages of text qualified as “fair use” under US copyright law because it was “quintessentially transformative.”

“Like any reader aspiring to be a writer, Anthropic’s (AI large language models) trained upon works not to race ahead and replicate or supplant them – but to turn a hard corner and create something different,” Alsup wrote.

But while dismissing a key claim made by the group of authors who sued the company for copyright infringement last year, Alsup also said Anthropic must still go to trial in December over its alleged theft of their works.

“Anthropic had no entitlement to use pirated copies for its central library,” Alsup wrote.

A trio of writers – Andrea Bartz, Charles Graeber and Kirk Wallace Johnson – alleged in their lawsuit last summer that Anthropic's practices amounted to “large-scale theft,” and that the San Francisco-based company “seeks to profit from strip-mining the human expression and ingenuity behind each one of those works.”

Books are known to be important sources of the data – in essence, billions of words carefully strung together – that are needed to build large language models. In the race to outdo each other in developing the most advanced AI chatbots, a number of tech companies have turned to online repositories of stolen books that they can get for free.

Documents disclosed in San Francisco's federal court showed Anthropic employees' internal concerns about the legality of their use of pirate sites. The company later shifted its approach and hired Tom Turvey, the former Google executive in charge of Google Books, a searchable library of digitised books that successfully weathered years of copyright battles.

With his help, Anthropic began buying books in bulk, tearing off the bindings and scanning each page before feeding the digitised versions into its AI model, according to court documents. But that didn't undo the earlier piracy, according to the judge.

“That Anthropic later bought a copy of a book it earlier stole off the internet will not absolve it of liability for the theft but it may affect the extent of statutory damages,” Alsup wrote.

The ruling could set a precedent for similar lawsuits that have piled up against Anthropic competitor OpenAI, maker of ChatGPT, as well as against Meta Platforms, the parent company of Facebook and Instagram.

Anthropic – founded by ex-OpenAI leaders in 2021 – has marketed itself as the more responsible and safety-focused developer of generative AI models that can compose emails, summarise documents and interact with people in a natural way.

But the lawsuit filed last year alleged that Anthropic’s actions “have made a mockery of its lofty goals” by building its AI product on pirated writings.

Anthropic said Tuesday it was pleased that the judge recognised that AI training was transformative and consistent with “copyright’s purpose in enabling creativity and fostering scientific progress.” Its statement didn't address the piracy claims.

The authors' attorneys declined comment. – AP
