Researchers explain the 'black box' nature of machine learning applications in drug research

Artificial intelligence (AI) is developing at a rapid pace, but its inner workings are often opaque: many systems behave like a "black box," and it is not possible to see how they reach their conclusions. Professor Jürgen Bajorath, a cheminformatics expert at the University of Bonn, and his team have now designed a technique that reveals how some AI systems used in pharmaceutical research operate.

Surprisingly, their results showed that these AI models relied primarily on recalling existing data rather than learning specific chemical interactions to predict a drug's effectiveness. Their research results were recently published in Nature Machine Intelligence.

Which drug molecule is most effective? Researchers are searching intensively for effective active substances to fight diseases. These compounds typically dock onto proteins, usually enzymes or receptors that trigger a specific cascade of physiological effects.

In some cases, certain molecules can also prevent adverse reactions in the body, such as excessive inflammation. Due to the vast variety of existing compounds, at first glance this research may seem like looking for a needle in a haystack. Drug discovery therefore attempts to use scientific models to predict which molecules will best dock and bind strongly to the corresponding target proteins. These potential drug candidates are then investigated in more detail in experimental studies.

Relative proportions of edges in protein-ligand interaction graphs that determine the predictions of six GNNs for different affinity subranges. Colored bars compare the average proportion of protein, ligand, and interaction edges among the top 25 edges for each prediction, determined using EdgeSHAPer. Image source: A. Mastropietro and J. Bajorath

As artificial intelligence develops, machine learning applications are increasingly used in drug discovery research. Among them, graph neural networks (GNNs) offer several opportunities for such applications. For example, they are suitable for predicting how strongly a given molecule binds to a target protein. For this purpose, GNN models are trained on graphs that represent the complexes formed between proteins and chemical compounds (ligands).

Graphs generally consist of nodes representing objects and edges representing relationships between nodes. In graph representations of protein-ligand complexes, edges connect only protein or ligand nodes, representing their structures respectively, or connect protein and ligand nodes, representing specific protein-ligand interactions.
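The graph representation described above can be sketched in a few lines of Python. This is a minimal illustration, not the data format used in the study: the class name `ComplexGraph`, the node identifiers, and the toy edges are all hypothetical. It only shows how an edge can be classified as a protein, ligand, or interaction edge from the kinds of the two nodes it connects.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ComplexGraph:
    """Toy protein-ligand graph (hypothetical structure, for illustration only)."""
    node_kind: dict = field(default_factory=dict)  # node id -> "protein" or "ligand"
    edges: list = field(default_factory=list)      # undirected edges as (u, v) pairs

    def add_node(self, node_id, kind):
        if kind not in ("protein", "ligand"):
            raise ValueError(f"unknown node kind: {kind}")
        self.node_kind[node_id] = kind

    def add_edge(self, u, v):
        # Both endpoints must already exist so the edge can be classified.
        if u not in self.node_kind or v not in self.node_kind:
            raise KeyError("both endpoints must be added before the edge")
        self.edges.append((u, v))

    def edge_type(self, u, v):
        kinds = {self.node_kind[u], self.node_kind[v]}
        if kinds == {"protein"}:
            return "protein"      # edge within the protein structure
        if kinds == {"ligand"}:
            return "ligand"       # edge within the ligand structure
        return "interaction"      # edge linking a protein node to a ligand node


# Invented example: two protein nodes, two ligand nodes, three edges.
graph = ComplexGraph()
for node_id, kind in [("P1", "protein"), ("P2", "protein"),
                      ("L1", "ligand"), ("L2", "ligand")]:
    graph.add_node(node_id, kind)
graph.add_edge("P1", "P2")
graph.add_edge("L1", "L2")
graph.add_edge("P2", "L1")

edge_counts = Counter(graph.edge_type(u, v) for u, v in graph.edges)
print(dict(edge_counts))
```

In a real model the nodes and edges would also carry chemical features, which this sketch leaves out.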

"How GNNs arrive at their predictions is like a black box that we cannot peek into," said Professor Jürgen Bajorath. The cheminformatics researcher from the LIMES Institute of the University of Bonn, the Bonn-Aachen International Center for Information Technology (B-IT), and the Lamarr Institute for Machine Learning and Artificial Intelligence in Bonn, together with colleagues at Sapienza University in Rome, analyzed in detail whether graph neural networks actually learn the interactions between proteins and ligands in order to predict how strongly an active substance binds to a target protein.

How do AI applications work?

The researchers analyzed a total of six different graph neural network architectures using their specially developed "EdgeSHAPer" method and a conceptually different comparison method. These computer programs "screen" whether the GNNs learn the most important interactions between the compound and the protein and use them to predict the potency of the ligand, as the researchers intended and expected, or whether the AI arrives at its predictions in other ways.

Professor Dr. Jürgen Bajorath - from the LIMES Institute of the University of Bonn, the Bonn-Aachen International Center for Information Technology (B-IT) and the Lamarr Institute for Machine Learning and Artificial Intelligence. Source: University of Bonn

"GNNs are very dependent on the data they are trained on," said Andrea Mastropietro, the study's first author and a doctoral student at Sapienza University in Rome.

The scientists trained the six GNNs on graphs extracted from the structures of protein-ligand complexes for which the mode of action and the compound's binding strength to the target protein were known from experiments. The trained GNNs were then tested on other complexes. The subsequent EdgeSHAPer analysis allowed the researchers to understand how the GNNs produced their apparently promising predictions.
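The figure caption earlier in this article compares the share of protein, ligand, and interaction edges among the top 25 edges of each prediction. The bookkeeping behind such a comparison can be sketched as follows. This is only an illustration under stated assumptions: the importance scores are invented, the function name `top_edge_proportions` is hypothetical, and ranking edges by raw score in descending order is an assumption rather than a description of how EdgeSHAPer ranks them.

```python
from collections import Counter

EDGE_TYPES = ("protein", "ligand", "interaction")


def top_edge_proportions(edge_scores, k=25):
    """Return the share of each edge type among the k highest-scoring edges.

    edge_scores: list of (edge_type, importance) pairs, where the importance
    values would come from an explanation method (invented here).
    """
    if not edge_scores:
        raise ValueError("edge_scores must not be empty")
    # Assumption: a higher score means a more important edge.
    ranked = sorted(edge_scores, key=lambda item: item[1], reverse=True)[:k]
    counts = Counter(edge_type for edge_type, _ in ranked)
    return {t: counts.get(t, 0) / len(ranked) for t in EDGE_TYPES}


# Invented scores for one prediction; k is small so the example is easy to follow.
toy_scores = [("ligand", 0.9), ("interaction", 0.1),
              ("ligand", 0.7), ("protein", 0.3)]
proportions = top_edge_proportions(toy_scores, k=2)
print(proportions)
```

A prediction dominated by ligand edges, as in this toy case, would correspond to a model that looks mostly at the compound and little at its contacts with the protein.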

"If the GNNs do what they are expected to do, they need to learn the interactions between the compound and the target protein, and the predictions should be determined by prioritizing specific interactions," Professor Bajorath explained. According to the research team's analysis, however, the six GNNs essentially failed to do this. Most of the GNNs learned only a few protein-drug interactions and focused mainly on the ligands. "To predict the binding strength of a molecule to a target protein, the models primarily 'memorized' the chemically similar molecules they encountered during training and their binding data, regardless of the target protein. These learned chemical similarities then essentially determined the predictions," Bajorath said.
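To make the idea of "memorizing" chemically similar molecules concrete, the sketch below predicts a binding value by looking up the most similar training compound and ignoring the protein entirely. It is a deliberately simple stand-in, not the models or the analysis from the study: the feature sets, the affinity values, and the function names are hypothetical, and Tanimoto similarity on feature sets is swapped in as one common way to compare molecules.

```python
def tanimoto(features_a, features_b):
    """Tanimoto (Jaccard) similarity between two sets of molecular features."""
    a, b = set(features_a), set(features_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def nearest_neighbor_affinity(query_features, training_set):
    """Return the affinity of the most similar training compound.

    training_set: list of (features, affinity) pairs. The target protein is
    not used at all, which is the point of this illustration.
    """
    if not training_set:
        raise ValueError("training_set must not be empty")
    best = max(training_set, key=lambda item: tanimoto(query_features, item[0]))
    return best[1]


# Invented training compounds with invented affinity values.
training = [
    ({"ring", "amide", "fluorine"}, 7.5),
    ({"ester", "chain"}, 5.0),
]
query = {"ring", "amide", "chlorine"}

similarity_to_first = tanimoto(query, training[0][0])
predicted = nearest_neighbor_affinity(query, training)
print(similarity_to_first, predicted)
```

A lookup like this can produce plausible numbers for compounds that resemble the training data without encoding anything about how a compound interacts with the protein.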

According to the scientists, this is strongly reminiscent of the "Clever Hans effect." The effect is named after a horse that could apparently count: the number of times Hans tapped his hoof was supposed to indicate the result of a calculation. It later turned out that the horse was not calculating at all, but was instead inferring the expected result from subtle cues in its handler's facial expressions and gestures.

What do these findings mean for drug discovery research? "It is generally not tenable that GNNs learn chemical interactions between active substances and proteins," the cheminformatician said. "Their predictions are largely overrated, because predictions of the same quality can be made using chemical knowledge and simpler methods." However, the research also offers opportunities for AI. Two of the GNN models examined showed a clear tendency to learn more interactions as the potency of the test compounds increased. "It is worth taking a closer look here," Bajorath said. "Perhaps these GNNs can be further improved in the desired direction through modified representations and training techniques. However, the assumption that physical quantities can be learned from molecular graphs should generally be treated with caution. Artificial intelligence is not black magic."

More light in the darkness of artificial intelligence

In fact, Bajorath sees the earlier public release of EdgeSHAPer and other specially developed analysis tools as a promising way to shed light on the black box of AI models. His team's current work focuses on GNNs and new "chemical language models."

"Developing methods to explain the predictions of complex models is an important area of artificial intelligence research. There are also methods for other network architectures, such as language models, that help to better understand how machine learning reaches its results," he said. He hopes that the Lamarr Institute will also soon achieve exciting results in the field of "explainable artificial intelligence."

Reference: "Learning characteristics of graph neural networks predicting protein-ligand affinities" by Andrea Mastropietro, Giuseppe Pasculli, and Jürgen Bajorath, November 13, 2023, Nature Machine Intelligence.

DOI:10.1038/s42256-023-00756-9

Compiled source: ScitechDaily