OpenAI has publicly responded to the copyright infringement lawsuit that The New York Times initiated in December.
The statement isn’t likely to result in the withdrawal of the lawsuit, but it offers insight into the lines of argument the company’s lawyers may follow.
Here’s a TLDR version of the four main points from the statement:
1. OpenAI says it aims to “support a healthy news ecosystem, be a good partner, and create mutually beneficial opportunities.” It says its products can help reporters and editors do their jobs better and reach their intended audiences in new ways.
In return, it would like to use news organizations’ data to train its models. OpenAI listed several media companies, like Axel Springer, with which it has entered into mutually beneficial relationships.
2. OpenAI still believes that training AI models with publicly available data is fair use and listed countries and organizations that agree.
OpenAI now provides a way for publishers to opt out by blocking its scraper bots, but it doesn’t mention any option for removing training data that was collected before the opt-out feature existed.
3. Regurgitation of verbatim pieces of copyrighted content is a “rare bug” and OpenAI is working on fixing that. If content from The Times is syndicated and published on multiple platforms, it’s bound to be reproduced by ChatGPT if users really try to get it to.
OpenAI says it expects users to “act responsibly” and not to do this. It also says The New York Times content is such a tiny slice of ChatGPT’s training data that it doesn’t really move the needle.
4. OpenAI says The New York Times is not telling the full story. OpenAI thought their talks were progressing positively until they picked up a copy of The Times and learned about the legal action.
The examples of the regurgitated content were from old articles that were plastered all over the internet. OpenAI says the examples of the verbatim content were induced and suspects that “they either instructed the model to regurgitate or cherry-picked their examples from many attempts.”
The statement, which you can read in full here, ends with OpenAI expressing its hope that it can patch things up with the paper. The alternative doesn’t look good for either side.