News
July 21, 2025

First U.S. Class Action on Generative AI: Northern District of California Certifies Partial Class Against Anthropic PBC

On July 17, 2025, the United States District Court for the Northern District of California issued a landmark ruling certifying the first class action in the United States to challenge the unauthorized use of copyrighted works to train generative artificial intelligence models. The litigation pits a group of prominent authors and copyright holders against Anthropic PBC, a major player in the generative AI sector.

Parties to the Litigation: Renowned Authors and Copyright Holders vs. a Leading AI Company

The plaintiffs are distinguished authors and their corporate entities:

  • Andrea Bartz, author of bestselling novels The Lost Night and The Herd, proceeding individually and on behalf of her company Andrea Bartz Inc.
  • Charles Graeber, journalist and author of acclaimed works such as The Good Nurse and The Breakthrough.
  • Kirk Wallace Johnson, essayist and founder of MJ + KJ Inc., known for To Be A Friend Is Fatal and The Feather Thief.

They claim ownership of the exclusive reproduction rights in their respective works, held either individually or through their corporate entities. Acting as class representatives, they seek redress for their own works and for those of similarly situated copyright owners.

The defendant, Anthropic PBC, is a U.S.-based artificial intelligence company developing large language models (LLMs), including the Claude family of models. Founded in 2021, Anthropic is among the leading firms in the AI industry, alongside OpenAI and Google DeepMind. The plaintiffs allege that Anthropic unlawfully reproduced millions of copyrighted works without authorization or compensation, using them to train its LLMs.

Factual Background: Use of Pirated Books to Train AI Models

According to the plaintiffs, between 2021 and 2022, Anthropic downloaded massive quantities of copyrighted books from unauthorized sources to build its training datasets. Specifically, the plaintiffs allege that Anthropic:

  • in 2021, downloaded approximately 196,640 unauthorized copies from Books3,
  • later downloaded over 5 million books from Library Genesis (LibGen) using BitTorrent,
  • and obtained nearly 2 million books from PiLiMi, a mirror site of the notorious Z-Library.

The plaintiffs assert that Anthropic stored these pirated copies in its internal library and integrated them into training corpora for its LLMs. They argue that metadata (ISBN, ASIN, MD5 hashes) embedded in the downloaded files can reliably identify the infringed works.
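As a purely illustrative sketch of how metadata of this kind can tie a file to a known work, the Python snippet below computes a file's MD5 hash and validates an ISBN-13 check digit, then looks both up in a catalog. The catalog dictionaries and the `match_work` helper are hypothetical and are not drawn from the court record; the sketch does not describe the method actually used by the parties.

```python
import hashlib
from typing import Dict, Optional


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of the file contents as a hex string."""
    return hashlib.md5(data).hexdigest()


def is_valid_isbn13(isbn: str) -> bool:
    """Check the ISBN-13 check digit (weights alternate 1, 3, 1, 3, ...)."""
    digits = isbn.replace("-", "").replace(" ", "")
    if len(digits) != 13 or not digits.isdigit():
        return False
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(digits))
    return total % 10 == 0


def match_work(
    file_bytes: bytes,
    isbn: Optional[str],
    catalog_by_md5: Dict[str, str],   # hypothetical: MD5 hex digest -> work title
    catalog_by_isbn: Dict[str, str],  # hypothetical: ISBN-13 digits -> work title
) -> Optional[str]:
    """Try the exact file hash first, then fall back to a validated ISBN."""
    title = catalog_by_md5.get(md5_hex(file_bytes))
    if title is not None:
        return title
    if isbn is not None and is_valid_isbn13(isbn):
        return catalog_by_isbn.get(isbn.replace("-", "").replace(" ", ""))
    return None
```

A hash match identifies one exact file, whereas an ISBN identifies an edition of a work, so the two identifiers complement each other in a lookup of this sort.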

Legal Grounds and Arguments

The plaintiffs brought suit under the Copyright Act (17 U.S.C. §§ 101 et seq.), alleging:

  • unauthorized reproduction and storage of their copyrighted works in violation of their exclusive rights under 17 U.S.C. § 106(1),
  • commercial use of these works for training LLMs without license or remuneration, and
  • infringement on a scale that warrants class-wide relief.

Anthropic opposed class certification, arguing:

  • the heterogeneity of the works and copyright owners precludes a cohesive class,
  • some downloaded files are incomplete or corrupted, making infringement determinations uncertain, and
  • technical obstacles to identifying every work allegedly copied prevent the court from managing a class action effectively.

Court’s Reasoning: Pragmatism and Protection of Copyright Holders

In his opinion, Judge William Alsup rejected Anthropic’s objections and granted partial certification for a class limited to works downloaded from LibGen and PiLiMi. The certified class is defined as:

“All legal and beneficial owners of copyrights in works registered with the Copyright Office within five years of first publication that Anthropic downloaded from Library Genesis (LibGen) or PiLiMi.”

The court held that:

  • identification of the downloaded works is feasible using metadata such as ISBNs, ASINs, and MD5 hashes,
  • common questions of law and fact predominate over individual issues, including ownership, infringement, and statutory damages, and
  • a class action is the superior method for adjudicating claims arising from Anthropic’s systemic use of pirated materials.

The court denied certification for two other proposed classes – the “Books3 Pirated Books Class” and the “Scanned Books Class” – due to insufficient evidence regarding file quality and copyright ownership.

Relief Ordered by the Court

The court ordered that:

  • The LibGen & PiLiMi Pirated Books Class be certified.
  • Notice be provided to copyright owners listed on registrations for works identified in the downloaded datasets.
  • Plaintiffs submit by September 1, 2025, a comprehensive list of affected works, supported by registration certificates and metadata.

Commentary: A Groundbreaking Precedent with Global Implications

This decision is a critical milestone in regulating the use of copyrighted works to train AI systems. By certifying this class action, the federal court signals to the AI industry that large-scale ingestion of copyrighted materials without proper authorization may trigger collective legal remedies.

The ruling may inspire similar actions in other jurisdictions and contribute to shaping international norms for the lawful and ethical use of protected works in AI training datasets.

This article was prepared by a French lawyer specializing in intellectual property and artificial intelligence. For legal advice on U.S. law, consultation with a qualified local attorney is recommended.

Vincent FAUCHOUX