OpenAI and MidJourney are looking to buy WordPress and Tumblr data

February 28, 2024


Automattic, the company behind WordPress and Tumblr, is discussing a data and content deal with MidJourney and OpenAI.

The report, first published by 404 Media and based on an unnamed source within Automattic, indicates that an agreement with OpenAI and MidJourney could be imminent.

This follows rumors circulating on Tumblr about a potential deal with MidJourney that could introduce a new revenue stream for the platform.

404 Media says the deal process has been messy so far, including a partially failed data transfer to OpenAI and MidJourney that contained, in the words of one Tumblr product manager:

“Private posts on public blogs, posts on deleted or suspended blogs, unanswered asks (normally these are not public until they’re answered), private answers (these only show up to the receiver and are not public), posts that are marked ‘explicit’ / NSFW / ‘mature’ by our more modern standards (this may not be a big deal, I don’t know).”

The implications remain unclear, and further details of the deal have yet to emerge.

The gold rush for AI training data moves up a notch

And just like that, the gold rush for AI training data has moved up a gear. 

Yes, generative AI companies have always needed vast quantities of data – but they’re now rushing to pay for it rather than scrape it for free. 

Just days ago, Reddit reportedly discussed licensing its vast array of user-generated content to a yet-to-be-revealed AI company, a deal that could be worth around $60 million annually. The news comes as Reddit gears up for a public offering in March, aiming for a valuation close to $5 billion.

This potential licensing agreement aligns with a growing trend among tech companies to secure legitimate data use agreements, especially in the face of increasing copyright risks.

Ongoing legal battles, such as the New York Times lawsuit against OpenAI and Microsoft, have dialed up the urgency for content deals.

Automattic’s move to negotiate with AI companies raises questions about using user-generated content for AI training.

Automattic has reportedly announced plans to introduce a new feature that allows users to opt out of having their data shared with third parties, including AI firms.

Following 404 Media’s report, Automattic published a statement saying, “We currently block, by default, major AI platform crawlers — including ones from the biggest tech companies — and update our lists as new ones launch,” and that it “will share only public content that’s hosted on [WordPress.com] and Tumblr from sites that haven’t opted out.”

It continues, “We are also working directly with select AI companies as long as their plans align with what our community cares about: attribution, opt-outs, and control.”

However, opting out of having your information used for AI training could penalize users’ accounts.

A not-yet-published FAQ titled “What happens when you opt out?” states, “If you opt-out from the start, we will block crawlers from accessing your content by adding your site to a disallowed list. If you change your mind later, we also plan to update any partners about people who newly opt-out and ask that their content be removed from past sources and future training.”

We’re now living in a world where anything you’ve posted on the internet could be sold for AI training purposes – if it’s not taken for free.

And as AI evolves, the debate over data use and privacy will likely intensify.

Companies that own data goldmines stand to win big, but at what cost to the average internet user?


Sam Jeans

Sam is a science and technology writer who has worked in various AI startups. When he’s not writing, he can be found reading medical journals or digging through boxes of vinyl records.

