A dismissed lawsuit brought by a group of artists against Stability AI and others has been resubmitted with seven additional plaintiffs added.
The original lawsuit had its claims of copyright infringement dismissed in October by Judge William Orrick, who disagreed that generating similar images was the same as copyright infringement.
While dismissing this aspect of the claim, Orrick said the plaintiffs could resubmit an amended suit in which they’d need to be more convincing in their allegations against Stability AI, Midjourney, DeviantArt, and Runway.
The original plaintiffs were artists Sarah Andersen, Kelly McKernan, and Karla Ortiz. They are now joined by H. Southworth, Grzegorz Rutkowski, Gregory Manchess, Gerald Brom, Jingna Zhang, Julia Kaye, and Adam Ellis.
The resubmitted suit contains some interesting arguments that, if accepted by the court, will have significant ramifications for all generative AI models trained using copyrighted data.
CEO of Stability AI Emad Mostaque is known for speaking off the cuff and often making bold or sensationalist claims. Many of his words are quoted as evidence against his company in the filing.
Here are some notable highlights from the amended lawsuit filing.
Can it really make a copy of the original?
The previous copyright claims were dismissed partly because the judge ruled that generating a similar image was not the same as producing a copy of an image.
The new submission says these models can duplicate their training data and quotes Mostaque saying, “We took 100,000 gigabytes of images and compressed it to a two-gigabyte file that can recreate any of those [images] and iterations of those.”
It further quotes Mostaque saying, “Ironically [the] main funding of stability except me is … artists…LOL”
The new filing asserts that “Copyright law protects artists’ works from infringement by creating exclusive rights of artists to make copies of their works, to make derivative works of their copyrighted works.”
The word “derivative” is key here. If you ask Midjourney to generate an image in the style of Kelly McKernan, have you and Midjourney violated McKernan’s copyright? You’ve got to feel for the artist when this happens:
The first image showing up under my name is AI. Screaming, crying, throwing up in my heart. pic.twitter.com/GgsoEVLj3o
— Kelly McKernan (@Kelly_McKernan) November 29, 2023
The LAION datasets
LAION created the LAION-5B dataset of billions of images. The dataset doesn’t contain copies of the images but rather the URLs where the images are located. To train your model, you need to fetch those images, and LAION created software to enable you to do that.
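To make the distinction concrete, here is a minimal Python sketch of a dataset that holds links rather than pictures, plus the separate download step needed before training. It is an illustration only, not LAION's actual tooling: the `ImageRecord` type, the `download_images` helper, and the example URLs are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable
import urllib.request


@dataclass(frozen=True)
class ImageRecord:
    """One hypothetical dataset row: a link and a caption, no image bytes."""
    url: str
    caption: str


def http_fetch(url: str) -> bytes:
    """Download the raw bytes behind a URL (assumed default fetcher)."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read()


def download_images(
    records: Iterable[ImageRecord],
    fetch: Callable[[str], bytes] = http_fetch,
) -> Dict[str, bytes]:
    """Fetch the image behind each record; this is where copies get made."""
    images: Dict[str, bytes] = {}
    for record in records:
        try:
            images[record.url] = fetch(record.url)
        except OSError:
            # Dead links and network errors are skipped in this sketch.
            continue
    return images


# The dataset itself is only text: links and captions.
records = [
    ImageRecord("https://example.com/painting.jpg", "a watercolor portrait"),
    ImageRecord("https://example.com/sketch.png", "a pencil sketch of a cat"),
]
```

In this sketch the image data only exists on the machine of whoever calls `download_images(records)`. The list of records contains nothing but text.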
The lawsuit claims LAION enabled copyright breach by doing this and that Stability AI was complicit. In August 2022, Stability CEO Mostaque said “I funded LAION, underlying dataset for … Stable Diffusion.” Mostaque has since denied this.
LAION datasets are key to most AI image generators. If using the LAION-5B dataset is ruled to be copyright infringement, that could shut down most of your favorite image generators.
Was GPT-4 Vision trained on LAION datasets? OpenAI isn’t saying but it could become a big issue for them too.
Does Stable Diffusion store copies of images?
In the initial lawsuit, Stability AI’s lawyers claimed it was wrong to say the Stable Diffusion model contained copies of the images on which it was trained. They argued that it was impossible for billions of images to be stored in a model of around 2GB.
The new filing insists that through machine learning Stable Diffusion does, in a novel way, store compressed copies of copyrighted images.
The amended submission quotes Mostaque describing Stable Diffusion in a recorded interview from August 2022:
“It’s worth taking a step back and thinking about how crazy insane this is: we took a hundred terabytes of data—a hundred thousand thousand megabytes of images—2 billion of them—and we squished it down to a 2–4 gigabyte file. And that file can create everything that you’ve seen. That’s insane, right? That’s about as compressed as you can get.”
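The figures in that quote can be turned into a rough per-image number. The Python sketch below is a back-of-the-envelope calculation only: it assumes decimal units (1 GB = 10^9 bytes) and takes Mostaque's round numbers at face value, and the variable names are ours rather than anything from the filing.

```python
# Figures as quoted by Mostaque (decimal units assumed: 1 GB = 10**9 bytes).
TRAINING_BYTES = 100_000 * 10**9   # "a hundred terabytes" of images
IMAGE_COUNT = 2 * 10**9            # "2 billion of them"
MODEL_BYTES_LOW = 2 * 10**9        # lower end of the "2-4 gigabyte file"
MODEL_BYTES_HIGH = 4 * 10**9       # upper end of the "2-4 gigabyte file"

# Average size of one training image before training.
avg_source_bytes = TRAINING_BYTES / IMAGE_COUNT

# Model capacity available per training image, at each end of the range.
model_bytes_per_image_low = MODEL_BYTES_LOW / IMAGE_COUNT
model_bytes_per_image_high = MODEL_BYTES_HIGH / IMAGE_COUNT

# Overall ratio of training data size to model size.
ratio_smallest_model = TRAINING_BYTES / MODEL_BYTES_LOW
ratio_largest_model = TRAINING_BYTES / MODEL_BYTES_HIGH

print(f"Average training image: {avg_source_bytes:,.0f} bytes")
print(f"Model bytes per image: {model_bytes_per_image_low:.0f} to "
      f"{model_bytes_per_image_high:.0f}")
print(f"Size ratio: {ratio_largest_model:,.0f}:1 to {ratio_smallest_model:,.0f}:1")
```

On these numbers the model has about one to two bytes of capacity per training image, against an average of roughly 50 KB per original. Stability AI points to that gap when it says the images cannot be stored in the model. The plaintiffs describe the same process as a novel form of compression.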
A significant case for Gen AI
There are several other significant issues raised in the suit that, once ruled on, will impact artists and the companies behind the AI models that generate your images.
It feels a bit like there’s no way to put this AI genie back in the copyright bottle. Will it end with Stability AI and companies like it paying class-action reparations as a mea culpa?
Or perhaps the latitude for “fair use” will be expanded to accommodate AI realities that didn’t exist a year ago.
Either way, this lawsuit is going to have dramatic effects on the industry and on the livelihoods of artists. It may also make Emad Mostaque choose his words more carefully in the future.