Wednesday, August 24, 2022

Geometry-enhanced molecular representation learning for property prediction

This paper looks interesting and promising. Earlier graph neural networks for molecules focused on the atoms and the chemical bonds between them, but largely ignored the molecule's three-dimensional geometry, such as the distances between atoms and the angles between bonds.

"... proposed geometry-enhanced molecular representation learning (GEM), an architecture and training method that classifies molecules and estimates their properties.
Key insight: Chemists have used graph neural networks (GNNs) to analyze molecules based on their atomic ingredients and the types of bonds between the atoms. However, these models weren’t trained on structural information, which plays a key role in determining a molecule’s behavior. They can be improved by training on structural features such as the distances between atoms and angles formed by their bonds. 
GNN basics: A GNN processes datasets in the form of graphs, which consist of nodes connected by edges. For example, a graph might depict customers and products as nodes and purchases as edges. This work used a vanilla neural network to update the representation of each node based on the representations of neighboring nodes and edges.
How it works: The authors trained a modified GNN on 18 million molecules whose properties were unlabeled to estimate structural attributes of molecules. ..."
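To make the "GNN basics" part of the quote concrete, here is a minimal sketch of one node-update step in plain NumPy. It is not GEM's actual architecture: the function name `message_passing_step`, the weight matrices `w_self` and `w_msg`, and the choice of a single linear layer with ReLU are all my own assumptions for illustration.

```python
import numpy as np

def message_passing_step(node_feats, edge_index, edge_feats, w_self, w_msg):
    """One round of neighbor aggregation (illustrative, not the GEM layer).

    node_feats: (num_nodes, d) array of node representations
    edge_index: list of (src, dst) pairs, one per directed edge
    edge_feats: (num_edges, e) array of edge representations
    w_self:     (d, d) weights applied to the node's own representation
    w_msg:      (d + e, d) weights applied to [neighbor, edge] messages
    """
    agg = np.zeros((node_feats.shape[0], w_msg.shape[1]))
    for (src, dst), e in zip(edge_index, edge_feats):
        # A message combines the neighbor's representation with the edge's.
        msg = np.concatenate([node_feats[src], e]) @ w_msg
        agg[dst] += msg  # sum messages arriving at the destination node
    # Combine the node's own state with the aggregated messages, then ReLU.
    return np.maximum(node_feats @ w_self + agg, 0.0)

# Toy 3-node chain 0 - 1 - 2, with each bond stored in both directions.
nodes = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
edges = [(0, 1), (1, 0), (1, 2), (2, 1)]
edge_attrs = np.full((4, 1), 0.5)
updated = message_passing_step(
    nodes, edges, edge_attrs,
    w_self=np.eye(2),
    w_msg=np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]),
)
print(updated)
```

In a molecule, the nodes would be atoms and the edges bonds; stacking several such steps lets information travel further along the graph.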

From the abstract:
"Effective molecular representation learning is of great importance to facilitate molecular property prediction. Recent advances for molecular representation learning have shown great promise in applying graph neural networks to model molecules. Moreover, a few recent studies design self-supervised learning methods for molecular representation to address insufficient labelled molecules; however, these self-supervised frameworks treat the molecules as topological graphs without fully utilizing the molecular geometry information. The molecular geometry, also known as the three-dimensional spatial structure of a molecule, is critical for determining molecular properties. To this end, we propose a novel geometry-enhanced molecular representation learning method (GEM). The proposed GEM has a specially designed geometry-based graph neural network architecture as well as several dedicated geometry-level self-supervised learning strategies to learn the molecular geometry knowledge. We compare GEM with various state-of-the-art baselines on different benchmarks and show that it can considerably outperform them all, demonstrating the superiority of the proposed method."
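As a rough illustration of what "molecular geometry" adds beyond the bond graph, the sketch below computes a bond length and a bond angle from 3D atom coordinates. The helper names `bond_length` and `bond_angle` are hypothetical, and the water-like coordinates are approximate textbook values that I chose for the example, not data from the paper.

```python
import numpy as np

def bond_length(coords, i, j):
    """Euclidean distance between atoms i and j."""
    return float(np.linalg.norm(coords[i] - coords[j]))

def bond_angle(coords, i, j, k):
    """Angle in degrees at atom j, between the bonds j-i and j-k."""
    v1 = coords[i] - coords[j]
    v2 = coords[k] - coords[j]
    cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    # Clip to guard against rounding slightly outside [-1, 1].
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

# Approximate water geometry (assumed values): O-H of about 0.96 angstrom,
# H-O-H of about 104.5 degrees. Atom 0 is O, atoms 1 and 2 are H.
theta = np.radians(104.5)
water = np.array([
    [0.0, 0.0, 0.0],
    [0.96, 0.0, 0.0],
    [0.96 * np.cos(theta), 0.96 * np.sin(theta), 0.0],
])

print(bond_length(water, 0, 1))
print(bond_angle(water, 1, 0, 2))
```

Two molecules with the same atoms and bonds can yield different values here, and a purely topological graph cannot see that difference.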

Geometry-enhanced molecular representation learning for property prediction | Nature Machine Intelligence (open access)


Fig. 1: Comparison between two stereoisomers with the same topology but different geometries

Fig. 2: Overall architecture of GEM


