26.06.2025

NEW TECH & INNOVATIONS

Between Protecting Creative Works and User Privacy – On the Implications of the NYT v. OpenAI Case

In recent weeks, the New York Times (NYT) lawsuit against Microsoft and OpenAI has once again captured the attention of both the technology and legal communities. On May 13, 2025, nearly two years into the proceedings, a judge of the U.S. District Court for the Southern District of New York ordered OpenAI to preserve and segregate all output data (that is, users’ chat histories), including data that would otherwise be deleted under standard data retention policies. The injunctive relief granted in this case may carry significant implications regarding potential violations of the General Data Protection Regulation (GDPR).

Background of the Dispute

The New York Times v. Microsoft et al. is one of the so-called “Newspaper Cases.” Two other prominent lawsuits in this category include Daily News v. Microsoft and Center for Investigative Reporting v. Microsoft. In these cases, the plaintiffs allege copyright infringement through the unauthorized and unlicensed use of copyrighted texts as training data for large language models (LLMs) developed by the defendants.

Injunctive Relief Granted

On May 13, 2025, the District Court judge issued an interim measure ordering the indefinite retention of users’ historical chat data at the plaintiff’s request. The order overrides OpenAI’s existing data retention policy, which previously allowed chat histories deleted by users to remain on OpenAI’s servers for only 30 days.

The stated aim of the order is to secure evidence necessary for fair judicial proceedings. In the judge’s view, without such a measure OpenAI would likely continue deleting potentially crucial evidence relevant to the NYT’s claims. The court also acknowledged the plaintiff’s argument that some OpenAI users may input copyrighted articles as prompts and ask the model to reproduce those articles verbatim, thereby infringing the copyright held by the New York Times.

It is worth noting that the judge explicitly ruled out the deletion of such data despite “various privacy rights and regulations,” implying that, in her assessment, the plaintiff’s copyright claims take precedence over user privacy rights.

The data preservation order applies to the following OpenAI products:

  • ChatGPT Free
  • ChatGPT Plus
  • ChatGPT Pro
  • OpenAI API Services

The following products are excluded from the scope of the court order:

  • ChatGPT for Enterprise
  • ChatGPT for Education
  • Accounts with a so-called “Zero Data Retention” agreement with OpenAI

Implications for Data Protection

The General Data Protection Regulation (GDPR) grants data subjects the right to erasure, also known as the right to be forgotten (Article 17). Upon request, the data controller, in this case OpenAI, must erase the individual’s personal data without undue delay. One of the exceptions foreseen under the Regulation applies where processing is necessary for the establishment, exercise or defence of legal claims (Article 17(3)(e)). However, whether this exception extends to lawsuits brought outside the European Union remains legally contentious. As a result, the New York court’s order may in effect compel OpenAI to breach obligations imposed by the GDPR.

The disregard by U.S. courts for foreign data protection laws is not new. In Société Nationale Industrielle Aérospatiale v. U.S. District Court (1987), the U.S. Supreme Court held that the Hague Evidence Convention does not displace ordinary U.S. discovery rules, and observed that foreign statutes restricting disclosure do not deprive American courts of the power to order a party subject to their jurisdiction to produce evidence.

Broader Context

The NYT lawsuit is just one of many ongoing cases in the United States alleging that large language models infringe copyright laws. It forms part of a broader debate concerning fair compensation for authors whose creative works are used to train AI models.

Commentary

Preservation orders are a well-established legal tool for securing potential evidence during litigation. However, in the context of AI systems, which process vast volumes of information, such orders warrant particular scrutiny. Users increasingly employ generative AI to process sensitive and highly personal queries, including private, family, or health-related matters. Such prompts may involve special categories of personal data, which qualify for heightened protection under Article 9 of the GDPR.

The NYT case presents a legal and ethical dilemma: on the one hand, the rights of creators to protect their intellectual property; on the other, the obligations of data controllers—such as data minimization and storage limitation. Balancing these competing interests poses a complex challenge, which should ideally be addressed during the design phase of AI-based systems.

It remains to be seen how this order may influence future regulation of the use of copyrighted content as training data for AI models.

