
AI, Fair Use, and Legal Risk: Governance Implications of Bartz v. Anthropic


Author(s):

Hadar Y. Jabotinsky
Founder and Senior Researcher, The Hadar Jabotinsky Center for Interdisciplinary Research of Financial Markets, Crises and Technology (HJC)
Michal Lavi
Senior Research Fellow at The Hadar Jabotinsky Center for Interdisciplinary Research of Financial Markets, Crises and Technology

A recent federal court ruling in Bartz v. Anthropic PBC has significantly shifted the legal terrain for artificial intelligence companies and the broader corporate governance community. While the case directly addresses copyright issues, its implications reach boardrooms, compliance departments, and AI policy stakeholders who must now navigate a more nuanced and potentially volatile regulatory environment.

On June 23, 2025, Judge William Alsup of the Northern District of California issued a closely watched opinion in Bartz v. Anthropic PBC, a copyright case that probes the legal boundaries of training large language models (LLMs) on massive corpora of text. At the core of the dispute lies a deceptively simple question: can the act of training artificial intelligence systems on copyrighted books qualify as fair use?

The court held that the use of copyrighted works in training LLMs like Claude may, under certain conditions, constitute transformative fair use. Yet the opinion draws a sharp line between such training and the maintenance of a permanent digital repository of pirated books, a practice the court found was not protected by fair use. In so doing, the decision sets a critical precedent that is likely to influence future litigation involving generative AI and copyright law. This boundary-setting has material consequences for companies that develop or deploy LLMs trained on third-party content. 

Judge Alsup's analysis adhered to the familiar four-factor test of 17 U.S.C. § 107. The court found that the first factor, the purpose and character of the use, favored Anthropic because training LLMs on text corpora was ‘exceedingly transformative.’ However, the court rejected Anthropic’s contention that its accumulation of a static library of pirated books was similarly protected. This distinction between transformative use for AI training and non-transformative data warehousing is doctrinally significant: it marks the evolving boundary between innovation and appropriation in the context of machine learning. Not all uses by AI companies are equally defensible; some may advance knowledge and expressive freedom, while others amount to little more than systemic theft. For companies investing in generative AI, this signals that the details of data governance matter. Boards and executive teams will need to ask: How was our model trained? What controls exist to manage downstream copyright risk? Are our data sources properly licensed?

In a prior article published in the Cardozo Arts & Entertainment Law Journal, titled ‘Can ChatGPT and the Like Be Your Co-Authors?’, we examined the murky intersection between copyright and AI-generated outputs. Current US copyright law recognizes only ‘original works of authorship’ by human creators. This creates fundamental uncertainty when dealing with machine-generated text, which may not qualify for copyright protection, or worse, may infringe on protected material. The Bartz ruling highlights that this debate is no longer theoretical. It is happening in courtrooms, and its outcome will determine both compliance strategies and innovation trajectories.

Beyond the authorship debate, we engaged with the risk of copyright infringement posed by AI-generated text. We noted that while some AI-generated content is derived from public domain material, many of the texts ingested in training sets remain under copyright. The creation of these datasets reflects the collective labor of countless individuals, often incorporated without consent or compensation. LLMs may produce outputs that, though not always amounting to literal plagiarism, borrow or rephrase protected material, exposing developers to claims of infringement. Our article underscores that US law remains unsettled. While courts like the one in Bartz recognize some AI training as fair use, others have dismissed infringement claims for failing to show substantial similarity between outputs and original works. Still, litigation is ongoing, and the broader doctrinal picture is far from clear. Thus, the questions raised in Bartz echo well beyond that case: Does the ingestion of copyrighted material by LLMs violate copyright law? Is the resulting output protected, infringing, or something in between? And as AI systems become more adept at mimicking human style and structure, should we recalibrate our understanding of creativity, originality, and fair use?

Legal Risk and Ethical AI: Next Steps for Boards

So what does this mean for governance? As generative AI technologies become increasingly sophisticated and commercially viable, governance practices must evolve in tandem. Several takeaways follow from this case. First, on risk management, boards should assess legal exposure tied to training datasets and licensing practices. Second, companies should create internal policies for ethical AI development, particularly around content ingestion. Third, publicly traded firms may need to disclose AI-related IP risks in securities filings. Finally, investors and audit committees should evaluate whether AI initiatives comply with evolving copyright norms.

In conclusion, we believe that the Bartz case is not the end of the story. The case offers a temporary equilibrium, a jurisprudential stopgap that invites both celebration and scrutiny. But the clock is ticking. As generative AI continues to evolve, so must our legal frameworks. The question is not just what the law is, but what it ought to become.

 

Hadar Y. Jabotinsky is the Founder and Head of The Hadar Jabotinsky Center for Interdisciplinary Research of Financial Markets, Crises and Technology. 

Michal Lavi is a Senior Research Fellow at The Hadar Jabotinsky Center for Interdisciplinary Research of Financial Markets, Crises and Technology.