OpenAI published a blog post on Jan. 8 responding to a lawsuit filed by The New York Times.
On Dec. 27, the newspaper alleged in that lawsuit that OpenAI violated copyright law by using millions of its articles to train automated chatbots.
OpenAI, however, claims that The New York Times has omitted key details from its account of events leading up to the lawsuit. Notably, OpenAI said that it and the newspaper had discussed cooperation prior to the lawsuit. It wrote:
“Our discussions with The New York Times had appeared to be progressing constructively through our last communication on [Dec. 19]. The negotiations focused on a high-value partnership around real-time display with attribution in ChatGPT, in which The New York Times would gain a new way to connect with their existing and new readers, and our users would gain access to their reporting.”
According to OpenAI, The New York Times had mentioned problems concerning regurgitation (that is, output virtually unchanged from the source data) but did not share any instances of it. OpenAI added that it believes the regurgitated material originated from articles that are several years old and were previously republished on third-party websites.
OpenAI additionally alleged that The New York Times deliberately manipulated prompts to produce the regurgitated material. It insisted that this type of activity is both unusual and not allowed, then described its progress on the problem.
OpenAI highlighted broader copyright efforts
OpenAI also said that it collaborates with news organizations. It noted that it has entered partnerships with groups including the Associated Press, Axel Springer, the American Journalism Project and New York University.
It also asserted that training efforts that fall outside those agreements are “fair use.” However, despite maintaining that it has a legal right to access and use that material, OpenAI said that it provides an opt-out process on principle. It said that The New York Times used the opt-out process to exclude itself in August 2023.
Incidentally, The Guardian highlighted a submission from OpenAI to the U.K. Parliament on Jan. 8. In that statement, originally from December, OpenAI said that AI training is impossible unless it has access to copyrighted material.