The New York Times v. OpenAI & Microsoft
Background
I · The facts
The highest-profile AI copyright suit. The Times presents 100+ examples of verbatim reproduction by GPT-4 and argues OpenAI's fair-use defense collapses when outputs substitute for the original. Discovery disputes over training logs and deletion practices have dominated 2025.
“Defendants' tools trained on millions of copyrighted Times works, then produced verbatim or near-verbatim output that competes directly with the Times — the fourth fair-use factor, market impact, is dispositive.”
— Complaint, 1:23-cv-11195, ¶ 96
The Claims
II · What the plaintiff argues
- Direct Copyright Infringement. Unauthorized reproduction of millions of Times articles in training data and in model outputs.
- Contributory Infringement. Microsoft's commercial integration of OpenAI's technology in Bing, Copilot, and Azure.
- DMCA § 1202. Removal of bylines, copyright notices, and paywall identifiers during ingestion.
- Unfair Competition. State-law claims that the products free-ride on Times journalism to build competing information services.
The Defense
III · What the defendant argues
OpenAI argues that training is transformative and that the verbatim-output examples in the complaint required adversarial prompting; they don't reflect ordinary use. Microsoft argues its use of OpenAI models is a protected software integration, not direct infringement of Times content.
Analysis
IV · Our read
The Times' 100+ example appendix is the most consequential exhibit in AI copyright litigation. It shifts the discussion from "could this theoretically reproduce" to "here are the actual reproductions. Explain."
OpenAI's "adversarial prompting" defense is a double-edged sword. If the verbatim reproduction required tricks, that arguably weakens the infringement claim. But it also admits the model memorized the content, which is fatal to the "mathematical abstraction" framing.
Discovery disputes over training logs and deletion practices have dominated 2025. The Times has aggressively litigated OpenAI's practice of deleting training artifacts, arguing spoliation.
A Times win would reshape the business model of generative AI: content licensing becomes a line item. A Times loss on fair use, combined with a win on § 1202, could still result in hundreds of millions in statutory damages.
What this means for you
Of the pending AI copyright suits, this case is the most likely to reach a jury verdict that addresses all four fair-use factors. Whatever the outcome, it is likely to be cited in every subsequent AI training case in the United States. Detection tooling becomes legally consequential: if courts rule that AI-generated text infringes, reliable detection becomes a compliance requirement, not a nice-to-have.
Timeline
V · Key dates
Filed in S.D. New York with 100+ example appendix.
OpenAI moves to dismiss; motion denied as to primary copyright claims.
Parties exchange training data documentation under strict protective order.
Times files motion alleging deletion of training artifacts; resolution pending.
Cross-motions on fair use filed.
Jury trial anticipated late 2026.