The Financial Times and OpenAI strike deal over training data

April 29, 2024

  • The Financial Times has entered into a training data deal with OpenAI
  • ChatGPT will start returning direct quotes and summaries from FT articles
  • As part of the deal, OpenAI will also work with the FT to develop AI tools

The FT is granting OpenAI access to its news archives as generative AI companies continue to secure private data sources. 

The arrangement involves ChatGPT providing summaries, direct quotes, and hyperlinks to full articles published by the FT, linking directly back to the original content on the website.

As part of the deal, OpenAI has committed to collaborating with the FT to develop new AI-driven products. The FT is a ChatGPT Enterprise customer and has experimented with AI before, incorporating Anthropic’s Claude into a generative search tool called “Ask FT.”

John Ridding, FT CEO, stated of the deal, “Apart from the benefits to the FT, there are broader implications for the industry. It’s right, of course, that AI platforms pay publishers for the use of their material. OpenAI understands the importance of transparency, attribution, and compensation – all essential for us.” 

A fair sentiment – though some would disagree that OpenAI understands the importance of transparency and attribution. 

Brad Lightcap, COO of OpenAI, also chimed in: “Our partnership and ongoing dialogue with the Financial Times is about finding creative and productive ways for AI to empower news organizations and journalists, and enrich the ChatGPT experience with real-time, world-class journalism for millions of people around the world.”

Access to the FT’s data is valuable for OpenAI, whose current datasets consist of trillions of words of dubiously ‘public’ or ‘open source’ data.

Deals with the FT and other media companies like Axel Springer come as AI companies acknowledge they need to start paying for data to address growing legal pressures. They’ve also become acutely aware that their models will quickly become outdated without fresh, high-quality data.

AI companies battle over data

The ethical stakes in AI data usage are immense. In their endless quest for data, tech giants like OpenAI, Google, and Meta have been reported to engage in practices that push or outright cross legal and ethical boundaries. 

For instance, a New York Times investigation revealed that OpenAI developed a tool named Whisper to transcribe YouTube videos – despite potential violations of YouTube’s policies against using its videos for independent applications.

Similarly, Google and Meta have explored or implemented strategies that skirt or reinterpret existing copyright and privacy laws to gather more data. 

Among the shadier strategies is altering privacy policies to allow AI applications to use publicly available content from platforms like Google Docs.

While AI companies are now willing to pay for data, that doesn’t stop them from bending the rules elsewhere.

Sam Jeans

Sam is a science and technology writer who has worked in various AI startups. When he’s not writing, he can be found reading medical journals or digging through boxes of vinyl records.
