Researchers aim to determine how ‘black box’ drug discovery models work

January 1, 2024

Scientists at the University of Bonn, led by Professor Dr. Jürgen Bajorath, have uncovered the inner workings of ‘black box’ AIs involved in pharmaceutical research. 

Their study, recently published in Nature Machine Intelligence, reveals that AI models in drug discovery predominantly depend on recalling existing data rather than learning new chemical interactions. This challenges previous assumptions about how AI makes predictions in this field.

Researchers use machine learning to find molecules that interact effectively with target proteins. This typically involves predicting which molecules will bind strongly to a given protein and then validating those predictions experimentally.

This form of AI-assisted drug discovery saw major breakthroughs in 2023, including an MIT-developed model that analyzed millions of compounds for potential therapeutic effects, AI-discovered drugs showing promise in slowing aging, and AI-generated proteins showing excellent binding strength.

The question Bajorath and his team sought to answer: how do these models achieve their results?

The research team focused on Graph Neural Networks (GNNs), a type of machine learning model widely used in drug discovery. GNNs are trained on graphs that represent potential drug interactions.
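As a rough illustration of what "training on graphs" involves, the sketch below encodes a tiny molecule as a graph and runs one simplified neighbor-aggregation step. It is a toy under stated assumptions, not the study's method: the three-atom molecule, the use of atomic numbers as features, and the names `build_adjacency` and `message_passing_step` are all hypothetical, and real GNNs apply learned weights rather than a plain sum.

```python
# Toy molecular graph: the heavy atoms of ethanol (C-C-O).
# Hypothetical illustration only; not taken from the study.
atoms = {0: "C", 1: "C", 2: "O"}
bonds = [(0, 1), (1, 2)]

# Assumed starting feature for each atom: its atomic number.
ATOMIC_NUMBER = {"C": 6, "O": 8}


def build_adjacency(num_atoms, bonds):
    """Map each atom index to the list of atoms it is bonded to."""
    adjacency = {i: [] for i in range(num_atoms)}
    for a, b in bonds:
        adjacency[a].append(b)
        adjacency[b].append(a)
    return adjacency


def message_passing_step(features, adjacency):
    """One simplified round: each atom adds its neighbors' features to its own.

    A real GNN would transform these sums with learned weights.
    """
    return {
        atom: features[atom] + sum(features[n] for n in adjacency[atom])
        for atom in adjacency
    }


features = {i: ATOMIC_NUMBER[symbol] for i, symbol in atoms.items()}
adjacency = build_adjacency(len(atoms), bonds)
updated = message_passing_step(features, adjacency)
print(updated)
```

After one step, each atom's value reflects its immediate neighborhood, which is the basic mechanism GNNs repeat across several layers.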

However, as Prof. Bajorath points out, “How GNNs arrive at their predictions is like a black box we can’t glimpse into.”

To shed light on this process, the team analyzed six different GNN architectures. Andrea Mastropietro, a study author and PhD candidate at Sapienza University in Rome, states, “The GNNs are very dependent on the data they are trained with.”

The researchers discovered that the GNNs predominantly rely on chemical similarities from their training data to make predictions rather than learning specific interactions between compounds and proteins.

This essentially means the AI models often “remember” rather than “learn” new interactions.

The “Clever Hans Effect” in AI

Researchers liken this phenomenon to the “Clever Hans effect,” named after a horse that appeared to perform arithmetic but was actually responding to subtle cues from its handler rather than understanding mathematics.

Similarly, the AI’s predictions are more about recalling known data than understanding complex chemical interactions.

The findings suggest that GNNs’ capability to learn chemical interactions is overestimated, and simpler methods might be equally effective. 
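To make the idea of a "simpler method" concrete, here is a minimal sketch of a similarity-based baseline: it predicts a value for a new compound by copying the label of the most similar training compound. Everything in it is an assumption for illustration. The fingerprints are made-up sets of feature IDs, the affinity values are invented, and `tanimoto` and `predict_by_nearest_neighbor` are hypothetical helper names. It is not the baseline used in the study.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of feature IDs."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def predict_by_nearest_neighbor(query, training_set):
    """Return the label of the most similar training compound and that similarity."""
    best_fp, best_label = max(
        training_set, key=lambda item: tanimoto(query, item[0])
    )
    return best_label, tanimoto(query, best_fp)


# Hypothetical training data: (fingerprint, made-up binding affinity).
training_set = [
    (frozenset({1, 2, 3, 4}), 7.5),
    (frozenset({5, 6, 7}), 4.0),
    (frozenset({1, 2, 8, 9}), 6.0),
]

query = frozenset({1, 2, 3, 5})
label, similarity = predict_by_nearest_neighbor(query, training_set)
print(label, similarity)
```

A model that behaves like this lookup is "remembering" in the sense the researchers describe: its prediction depends entirely on how closely a new compound resembles something already in the training data.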

However, some GNNs showed potential in learning more interactions, indicating that improved training techniques could enhance their performance.

Prof. Bajorath’s team is also developing methods to clarify how AI models function in pursuit of “Explainable AI,” an emerging field that aims to make AI’s decision-making processes transparent and understandable.

Sam Jeans

Sam is a science and technology writer who has worked in various AI startups. When he’s not writing, he can be found reading medical journals or digging through boxes of vinyl records.

