
Encyclopedia Britannica is suing OpenAI for allegedly ‘memorizing’ its content with ChatGPT

TL;DR

Encyclopedia Britannica and Merriam-Webster have sued OpenAI, alleging the company used their copyrighted content without permission to train GPT-4.

Key Points

  • Britannica claims GPT-4 has 'memorized' large portions of its content and can reproduce near-verbatim copies on demand.
  • The plaintiffs argue the model outputs are 'substantially similar' to their original texts, constituting copyright infringement.
  • Both publishers are among the most established reference works in the English-speaking world, giving the case significant symbolic weight.

Nauti's Take

The 'memorization' argument is legally clever because it shifts the focus from the training process to the output – and that is where infringement can actually be demonstrated concretely. Britannica apparently did its homework, systematically prompting GPT-4 with queries about its own content.

If the court agrees, 'how similar is the output to the original?' becomes the central compliance question for every AI provider.
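
To make that question concrete, here is a minimal sketch of how output-to-original similarity could be scored. It is not the method used in the complaint, and it is not a legal test for 'substantial similarity'. The function names, the 5-word window, and the whitespace tokenization are all illustrative assumptions.

```python
from difflib import SequenceMatcher


def word_ngrams(text: str, n: int = 5) -> set[tuple[str, ...]]:
    """Return the set of lowercase word n-grams in text (naive whitespace split)."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def ngram_overlap(original: str, output: str, n: int = 5) -> float:
    """Fraction of the output's word n-grams that also occur in the original."""
    out_grams = word_ngrams(output, n)
    if not out_grams:
        return 0.0  # output shorter than n words: nothing to compare
    return len(out_grams & word_ngrams(original, n)) / len(out_grams)


def longest_shared_run(original: str, output: str) -> int:
    """Length, in words, of the longest contiguous word sequence found in both texts."""
    a, b = original.lower().split(), output.lower().split()
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return matcher.find_longest_match(0, len(a), 0, len(b)).size


if __name__ == "__main__":
    # Hypothetical texts, not taken from Britannica or from any model output.
    original = "the quick brown fox jumps over the lazy dog near the river"
    output = "a quick brown fox jumps over the lazy cat"
    print(ngram_overlap(original, output))
    print(longest_shared_run(original, output))
```

A high n-gram overlap or a long shared run points to near-verbatim reproduction. Which threshold matters legally is exactly what the court would have to decide, and punctuation handling is deliberately left naive here.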

Retrieval-augmented generation (RAG) suddenly looks less like a technical choice and more like a legal necessity.

Sources