News
10/6/25

Reddit, Inc. v. Anthropic, PBC – Legal Action over the Unauthorized Use of User-Generated Content in AI Training

A new legal battle has emerged in the growing wave of litigation pitting content rights holders against major AI system developers. The recent action brought by Reddit, Inc. against Anthropic, PBC adds to a long list of ongoing disputes over the unauthorized use of online content in the development of generative artificial intelligence systems. What makes this case particularly noteworthy is the level of legal sophistication the plaintiff demonstrates in structuring its claims under U.S. law. Reddit’s complaint exemplifies a maturing legal strategy among platform operators seeking to assert control over their data assets, not only through IP rights but also through a multi-pronged contractual and tort-based approach.

On June 4, 2025, Reddit, Inc. filed a complaint in the Superior Court of California, County of San Francisco, against Anthropic, PBC, alleging a series of unlawful acts arising from the unauthorized extraction of Reddit content and its use to train and commercialize Anthropic’s large language model, Claude. Both parties are Delaware corporations headquartered in San Francisco, California. Reddit asserts that the California court has jurisdiction under both statutory provisions and contractual consent, since Anthropic accessed Reddit’s platform subject to its User Agreement, which includes a forum selection clause.

The case touches on critical questions regarding content ownership, contractual consent, the enforceability of terms of use, and the obligations of AI developers in sourcing training data. At the heart of the dispute lies Anthropic’s alleged systematic use of Reddit’s platform, one of the internet’s largest repositories of public discussion forums, to extract user-generated content and integrate it into its generative AI model, Claude, without authorization or compensation.


Factual Background and Core Allegations

Reddit describes itself as the custodian of one of the largest datasets of natural language conversations in the world, consisting of contributions from over 100 million daily users. While the platform operates under an “open internet” ethos, Reddit emphasizes that it never granted carte blanche to commercial actors to extract or exploit that content for profit. The platform’s terms of use explicitly prohibit such activity without a license.

The complaint alleges that Anthropic began scraping Reddit content as early as 2021, using automated bots—including ClaudeBot—to systematically access and extract massive volumes of user content. These acts, Reddit contends, directly violated its User Agreement, which prohibits both unauthorized automated access and the commercial exploitation of Reddit’s services and content. Anthropic is further accused of ignoring Reddit’s anti-scraping technical measures, including its robots.txt file and IP-based rate limits, and of continuing to access Reddit’s servers even after public representations that it had ceased doing so.

Reddit further notes that Anthropic’s technical staff, including its CEO and co-founders, explicitly cited Reddit as a preferred dataset for fine-tuning Claude, identifying dozens of specific subreddits as “high-quality” sources. Claude’s public responses frequently demonstrate knowledge of Reddit-specific content, communities, and discussions, which Reddit treats as further confirmation that its data was directly ingested into Anthropic’s training pipeline.

Reddit emphasizes that it has entered into licensing agreements with other major players in the AI space—including OpenAI and Google—under strict privacy and compliance frameworks. These licenses ensure that user deletion requests are respected, especially via Reddit’s Compliance API, which notifies licensees of deleted posts or policy violations. Anthropic, by contrast, refused to negotiate such a license, thereby depriving Reddit and its users of legal, technical, and financial safeguards.

Legal Causes of Action

Reddit brings five causes of action under California law:

First, Reddit asserts breach of contract, arguing that Anthropic accepted the terms of Reddit’s User Agreement each time it accessed the platform, and that it violated core provisions prohibiting scraping and commercial use of the content.

Second, it pleads unjust enrichment in the alternative, contending that Anthropic has built a multibillion-dollar AI product in part on the unauthorized use of Reddit’s intellectual assets, while Reddit has received no compensation and its users have suffered loss of privacy control.

Third, the complaint alleges trespass to chattels, based on Anthropic’s repeated and unauthorized use of Reddit’s servers and infrastructure, which imposed financial and operational burdens without consent.

Fourth, Reddit brings a claim for tortious interference with contractual relations, noting that it holds binding contracts with its users to safeguard their privacy and content control, obligations that Anthropic knowingly undermined by refusing to comply with deletion protocols and continuing to train Claude on potentially deleted posts.

Finally, Reddit alleges unfair competition under California Business and Professions Code § 17200, citing unlawful, unfair, and deceptive practices. Among the accusations are misrepresentations about respecting privacy measures, failure to comply with industry norms, and wrongful appropriation of valuable data for AI commercialization.

Requested Remedies and Implications

Reddit seeks a wide range of remedies, including compensatory damages, disgorgement of Anthropic’s profits, restitution, punitive damages, and a permanent injunction. In particular, Reddit asks the Court to bar Anthropic from using any Reddit-derived content in its commercial offerings and to compel the deletion or withdrawal of models—such as Claude—that were trained on Reddit data.

This lawsuit forms part of a growing body of litigation seeking to define the legal limits of data scraping and model training in the age of generative AI. Reddit’s action underscores the need for AI companies to obtain proper licensing when using user-generated content, especially when such content is subject to privacy rights and explicit contractual terms.

The outcome of this case may have far-reaching consequences, not only for the legal treatment of public internet data but also for the business models of AI developers relying on massive unlicensed datasets to train commercial-grade models.

Vincent FAUCHOUX