A group of five US authors has filed separate class action lawsuits against OpenAI and Meta, adding to the growing list of legal battles OpenAI is already facing.
The authors, Michael Chabon, David Henry Hwang, Matthew Klam, Rachel Louise Snyder, and Ayelet Waldman, claim that OpenAI and Meta infringed on their copyright while training their respective GPT and Llama AI models.
These lawsuits come as OpenAI is already fighting two other lawsuits by writers claiming that they suffered similarly by having their work used without permission.
OpenAI's lawyers have yet to respond to this latest lawsuit, but in the earlier cases they have continued to insist that the company’s use of copyrighted material was legal under the principle of fair use.
OpenAI has not conceded that it included the copyrighted works in ChatGPT’s training datasets, but it also hasn’t denied it.
I’ve joined a lawsuit against OpenAI (owner of ChatGPT) over copyright infringement. This will, I hope, put some boundaries on AI in order to protect writers not only for my generation, but for many to come. @LauraDeNardis @nytimesbooks https://t.co/TM4jhXzuSk
— Rachel Louise Snyder (@RLSWrites) September 12, 2023
It’s obvious, though, that OpenAI did use these published works.
If you ask ChatGPT for a summary of Pulitzer Prize winner Michael Chabon’s book ‘The Amazing Adventures of Kavalier & Clay’, it happily provides one.
If you prompt it for an excerpt from the book, ChatGPT responds by saying, “I’m sorry, but I can’t provide verbatim excerpts from copyrighted texts.”
That seems fair enough. Other companies publish summaries or book reviews without falling foul of copyright laws. And it’s in an author’s interest for people to get to know their works, right?
However, the authors claim that, besides making money from their work through the paid ChatGPT Plus product, OpenAI’s models infringe on their rights in another significant way.
The legal filing in OpenAI’s case states that “if ChatGPT is prompted to generate a writing in the style of a certain author, GPT would generate content based on patterns and connections it learned from analysis of that author’s work within its training dataset.”
That’s a fair point. If you ask ChatGPT to write a paragraph in the style of the book, it does a great job. Here’s an excerpt of what it wrote:
“In the dimly lit room, surrounded by stacks of paper and half-finished sketches, Sam Clay and Josef Kavalier hunched over their drawing table, their creative energies flowing like a wellspring of imagination. Outside, the world churned with the rumblings of a war that seemed to stretch endlessly across continents.”
That’s pretty good. Now the fair use argument looks a little more tenuous and the potential harm to the authors becomes clearer.
A person could get ChatGPT to write an entire novel in the style of a famous author. But not everyone thinks that this is a legal issue.
In July, Kent Walker, Google’s president of global affairs, told The Washington Post, “The AI models are basically learning from all of the information that’s out there. It’s akin to a student going and reading books in a library and then learning how to write and read.”
The legal filing against Meta seems to argue points similar to those in the earlier lawsuits against OpenAI. OpenAI has asked for most of those claims to be dismissed but intends to argue specific points on the issue of copyright and training data.
Meta and OpenAI are by no means the only AI developers that have used questionable sources for AI training data. They happen to be at the front of the legal firing line, but companies like Microsoft, Google, and Stability AI will be keeping a close eye on the proceedings.