DAI#45 – New top model, lawsuit blues, and puzzled AI

June 28, 2024

Welcome to this week’s roundup of hand-assembled bespoke AI news.

This week Anthropic knocked OpenAI off pole position.

AI audio generators face the music in court.

And the top LLMs struggle with a puzzle your kids can solve.

Let’s dig in.

Claude vs GPT-4o

After months of AI models claiming to be ‘almost as good as GPT-4’, we’ve finally got a model that pushes OpenAI off its top spot on the leaderboards.

Anthropic released Claude 3.5 Sonnet, an upgraded version of its mid-size Claude model. Benchmark results, including MMLU, show it beating GPT-4o and Google’s Gemini 1.5 Pro in almost every test.

With an even more powerful Claude 3.5 Opus expected soon, what will OpenAI’s response be?

After Meta called off its launch of Meta AI in the EU, Apple is doing the same due to strict laws in the region.

Apple has delayed the rollout of its Apple Intelligence features there as EU tech fans watch the rest of the world get first dibs.

Sounds familiar…

AI companies are getting sued, and for a change, it’s not OpenAI or Meta.

Text-to-audio platforms Suno and Udio generate impressive music, but how did they get so good?

The Recording Industry Association of America is suing the companies, saying they “stole copyrighted sound recordings” to train their AI. When the judge listens to these sample clips, it might be a short day in court.

An AI company using copyrighted material to train its models without paying the creators? We’re as unsurprised as you are.

Recreating copyrighted music isn’t the worst thing AI is being used for, though. A DeepMind study says that the leading form of AI misuse is bad guys creating deepfakes for opinion manipulation.

The rest of the AI misuse list makes for interesting reading.

Are you sure that’s right?

AI models are really good at generating very plausible but completely wrong information.

AI scientists say hallucinations can’t be fixed, but a University of Oxford study identified when AI hallucinations are more likely to occur.

“Semantic entropy” gauges the AI model’s confidence by measuring how much the meaning of its answers varies when it’s asked the same question several times. It’s also my new polite way to say someone is talking BS.
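
For the curious, here is a rough sketch of the idea in Python. It is not the Oxford team's code. It assumes you have already sampled several answers to the same question, groups them by meaning, and computes the entropy over those groups. The `same_meaning` helper is a hypothetical toy stand-in that only compares normalized text; a real system would need a much smarter check of whether two answers mean the same thing.

```python
import math
from typing import Callable, List


def normalize(text: str) -> str:
    """Lowercase, trim surrounding punctuation, and collapse whitespace."""
    return " ".join(text.lower().strip(" .!?").split())


def same_meaning(a: str, b: str) -> bool:
    """Toy stand-in (an assumption for illustration): treats two answers as
    equivalent only if their normalized text matches exactly."""
    return normalize(a) == normalize(b)


def cluster_by_meaning(
    answers: List[str],
    equivalent: Callable[[str, str], bool] = same_meaning,
) -> List[List[str]]:
    """Greedily group answers that the `equivalent` check says mean the same."""
    clusters: List[List[str]] = []
    for answer in answers:
        for cluster in clusters:
            if equivalent(cluster[0], answer):
                cluster.append(answer)
                break
        else:
            # No existing cluster matched, so this answer starts a new one.
            clusters.append([answer])
    return clusters


def semantic_entropy(
    answers: List[str],
    equivalent: Callable[[str, str], bool] = same_meaning,
) -> float:
    """Entropy (in nats) over meaning clusters of the sampled answers.

    Near zero means the answers agree in meaning; higher values mean the
    answers are scattered across many different meanings.
    """
    if not answers:
        raise ValueError("need at least one sampled answer")
    total = len(answers)
    clusters = cluster_by_meaning(answers, equivalent)
    return -sum(
        (len(c) / total) * math.log(len(c) / total) for c in clusters
    )
```

If every sampled answer lands in the same group, the score is zero. If every answer says something different, the score climbs, which is the signal that the model may be making things up.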


Even the most advanced LLMs make stuff up when presented with surprisingly simple puzzles. This week, users on X posted examples of how the smartest models can’t solve a simple river crossing puzzle.

Is it evidence that LLMs aren’t good at reasoning, or is something else happening here?

AI might struggle with some riddles but it knows you better than you think. A new study found that an AI system can predict how anxious you are from how you react to photos.

The ability of these models to infer human emotions could be very helpful, but might be a source of human anxiety too.

AI open season

When AI companies use the word “open” to describe their models, it rarely means what you think it does.

How “open” are these AI models? Sam took a closer look at which AI models are truly open and why some companies keep certain aspects very much closed.

This week saw an exciting development in the open model space. EvolutionaryScale’s ESM3 is a generative model for biology that turns prompts into proteins.

Previously, scientists looking for a novel protein would have to wait for nature to come up with it or try a hit-or-miss approach in the lab.

Now ESM3 enables scientists to program biology and create proteins beyond nature.

AI events

If you want to level up your marketing efforts then check out the MarTech Summit Hong Kong 2024 happening on 9 July.

The AI Accelerator Institute presents the Generative AI Summit Austin 2024 on 10 July. The agenda sees industry leaders discuss the latest trends in real-world generative AI applications.

And that’s a wrap.

Have you tried out the upgraded Claude? The Artifacts window is seriously cool. It’s a sure bet that ChatGPT will get a similar feature very soon.

I love playing with Udio and Suno but there’s no denying they rip off copyrighted music. Is this the price of progress or is it a showstopper?

I’m still surprised that AI models struggle with a simple river crossing puzzle. We should probably fix that before letting AI control really important stuff like power grids or hospitals.

Let us know what you think and keep sending us links to interesting AI news and research we may have missed.

Eugene van der Watt

Eugene comes from an electronic engineering background and loves all things tech. When he takes a break from consuming AI news you'll find him at the snooker table.
