OpenAI outlines plans for responsible AI data usage and creator partnerships 

May 8, 2024

  • OpenAI has discussed changing its data scraping strategies in favor of creators
  • This includes a "Media Manager" platform for creators to interact with their data
  • Slated for 2025, the tool has led some to argue it's a case of too little, too late

OpenAI recently announced a new approach to data and AI, emphasizing the importance of responsible AI development and partnerships with creators and content owners. 

The company has pledged to build AI systems that expand opportunities for everyone while respecting the choices of creators and publishers.

“AI should expand opportunities for everyone. By transforming information in new ways, AI systems help us solve problems and express ourselves,” OpenAI stated in its recent blog post.

As part of this strategy, the company is developing a tool called Media Manager, poised to enable creators and content owners to specify how they want their works to be included or excluded from machine learning research and training. 

“Our goal is to have the tool in place by 2025, and we hope it will set a standard across the AI industry,” OpenAI stated.

There’s little information available about Media Manager or how it might work. It will likely take the form of a self-service tool through which creators can identify and control their data.

Some wonder whether OpenAI will use machine learning to actively identify creators’ data inside its datasets – which would be a significant undertaking.

Ultimately, we don’t yet know how it’ll work or how effective it will be. 

A positive move from OpenAI? Possibly, but if OpenAI genuinely believes that training AI models on publicly available data falls under fair use, there would be no need for an opt-out option. 

Moreover, if OpenAI can develop tools to identify copyrighted material, it could probably use them to filter its data scraping from the outset rather than requiring content creators to opt out.

Plus, a 2025 launch gives OpenAI enough time to build a colossal foundational dataset of people’s copyrighted works without their permission. 

From there, it’s primarily a matter of fine-tuning. OpenAI will continue to purchase data from sources like the Financial Times and Le Monde to keep its models up to date. 

This does, at least, serve as evidence that there’s pressure on OpenAI and other AI companies to handle data more ethically. 

Contributing to a desk full of lawsuits, European privacy advocacy group Noyb recently launched legal action against OpenAI, claiming that ChatGPT repeatedly generates inaccurate information about people and fails to correct it. 

OpenAI’s response was characteristic: ‘You might be right, but we can’t, or won’t, do anything about it.’


Sam Jeans

Sam is a science and technology writer who has worked in various AI startups. When he’s not writing, he can be found reading medical journals or digging through boxes of vinyl records.
