Encyclopaedia Britannica and Merriam-Webster Take on OpenAI in Landmark AI Copyright Suit

Britannica argues generative AI doesn’t just learn from content—it competes with it.

MITSloan ME Editorial

Encyclopaedia Britannica and its subsidiary, Merriam-Webster, have filed a lawsuit against OpenAI in Manhattan federal court, adding to a growing body of litigation challenging how generative AI systems are trained.

The complaint alleges that OpenAI used Britannica’s encyclopedia entries and dictionary definitions—amounting to nearly 100,000 articles—without authorization to train its large language models (LLMs), including ChatGPT. In doing so, Britannica argues, the company not only reproduced protected material but also undermined its core business by diverting web traffic through AI-generated summaries.

At the center of the dispute is a familiar tension: whether training AI models on copyrighted material constitutes fair use or infringement. OpenAI maintains that its systems rely on publicly available data and transform that data into new outputs, a position that has become the industry’s standard legal defense.

“Our models empower innovation, and are trained on publicly available data and grounded in fair use,” an OpenAI spokesperson said in response to the lawsuit. Courts, however, have yet to definitively resolve how this doctrine applies at scale in the context of generative AI.

Britannica’s claims go beyond data usage. The company alleges that ChatGPT can produce “near-verbatim” excerpts of its content, raising questions about the extent to which generative models truly transform their training data. It also accuses OpenAI of trademark infringement, arguing that the chatbot sometimes attributes information to Britannica in ways that imply authorization, and in other cases produces misleading “hallucinated” citations. Such behavior, the complaint suggests, risks eroding both brand integrity and user trust.

This case is not an isolated incident but part of a broader legal pushback from content owners, including publishers, authors, and media organizations. Britannica itself has previously sued Perplexity AI on similar grounds. Collectively, these cases signal an emerging effort to redefine the boundaries of intellectual property in the age of machine learning.

Despite the high stakes, judicial responses so far have been cautious and incremental. Few rulings have imposed meaningful constraints on AI companies, leaving key questions unresolved: Does large-scale data scraping for model training qualify as transformative use? And if so, at what point does output similarity cross into infringement?

As generative AI systems become more embedded in information ecosystems, the outcome of such cases could reshape not only legal doctrine but also the economics of knowledge production. For legacy reference institutions like Britannica, the issue is existential; for AI developers, it is foundational.
