Reddit sues Perplexity for scraping data to train AI system


Perplexity logo is seen in this illustration taken May 20, 2024. REUTERS/Dado Ruvic/Illustration

(Reuters) - Social media platform Reddit sued artificial intelligence startup Perplexity in New York federal court on Wednesday, accusing it and three other companies of unlawfully scraping its data to train Perplexity's AI-based search engine.

Reddit said in the complaint that the data-scraping companies circumvented its data protection measures in order to steal data that Perplexity "desperately needs" to power its "answer engine" system.

The case is one of many filed by content owners against tech companies over the alleged misuse of their copyrighted material to train AI systems. Reddit filed a similar lawsuit against AI startup Anthropic in June that is still ongoing.

"Our approach remains principled and responsible as we provide factual answers with accurate AI, and we will not tolerate threats against openness and the public interest," Perplexity said in a statement.

"AI companies are locked in an arms race for quality human content - and that pressure has fueled an industrial-scale 'data laundering' economy," Reddit chief legal officer Ben Lee said in a statement.

Reddit, which features thousands of interest-based "subreddit" web communities, said in the lawsuit that it is the most commonly cited source for AI-generated answers to user questions. It has licensed its content to Google, OpenAI and others for their AI training.

Reddit said that Lithuania-based Oxylabs, Russia-based AWMProxy and Texas-based SerpApi scraped Reddit data from billions of search results without permission and that Perplexity, which does not have a license to use Reddit content, worked with at least one of the data-scraping companies to obtain Reddit material.

"We strongly disagree with Reddit's allegations and intend to vigorously defend ourselves in court," a SerpApi spokesperson said. Oxylabs said in a statement that it was "shocked and disappointed by this news, as Reddit has made no attempt to speak with us directly," and that it would also defend itself against the allegations.

AWMProxy could not be reached for comment.

Reddit said it sent Perplexity a cease-and-desist letter last year, after which Perplexity "increased the volume of citations to Reddit forty-fold."

Reddit asked the court for unspecified monetary damages and an order blocking Perplexity from using its data.

(Reporting by Blake Brittain in Washington; Editing by Nick Zieminski and Stephen Coates)
