
(2312.17495) Integrating Chemical Language and Molecular Graph in Multimodal Fused Deep Learning for Drug Property Prediction

Authors: Xiaohua Lu and 5 other authors


Abstract: Accurately predicting molecular properties is a challenging but essential task in drug discovery. Recently, many monomodal deep learning methods have been successfully applied to molecular property prediction. However, monomodal learning is inherently limited because it relies exclusively on a single modality of molecular representation, which restricts a comprehensive understanding of drug molecules and hampers the models' resilience to data noise. To overcome these limitations, we construct multimodal deep learning models that cover different molecular representations. We convert drug molecules into three molecular representations: SMILES-encoded vectors, ECFP fingerprints, and molecular graphs. To process each modality, a Transformer-Encoder, bidirectional gated recurrent units (BiGRU), and a graph convolutional network (GCN) are used, respectively, for feature learning, which can improve the model's ability to acquire complementary and naturally occurring bioinformatics information. We evaluate our triple-modal model on six molecular datasets. Unlike bimodal learning models, we apply five fusion methods to capture the specific features and better utilize the contribution of each modality. Compared with monomodal models, our multimodal fused deep learning (MMFDL) models achieve higher accuracy, reliability, and noise resilience. Moreover, we demonstrate the generalization ability of the approach by predicting binding constants for protein-ligand complexes in the refined set of PDBbind. The advantage of the multimodal model lies in its ability to handle diverse sources of data by using appropriate models and suitable fusion methods, which improves the noise resilience of the model while obtaining data diversity.
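
The abstract does not name the five fusion methods, so the following is only a minimal sketch of one generic strategy, a weighted late fusion of per-modality predictions. It is not necessarily one of the paper's methods. The function name `fuse_predictions`, the modality keys, the weights, and the numeric values are all hypothetical and chosen for illustration.

```python
from typing import Dict, List, Sequence


def fuse_predictions(
    per_modality: Dict[str, Sequence[float]],
    weights: Dict[str, float],
) -> List[float]:
    """Weighted average of per-modality predictions (a generic late fusion).

    Assumption: each modality-specific model has already produced one
    prediction per molecule, and all modalities cover the same molecules
    in the same order.
    """
    if set(per_modality) != set(weights):
        raise ValueError("each modality needs exactly one weight")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    lengths = {len(values) for values in per_modality.values()}
    if len(lengths) != 1:
        raise ValueError("all modalities must predict the same molecules")
    n_molecules = lengths.pop()

    fused = []
    for i in range(n_molecules):
        # Normalised weighted sum over the modalities for molecule i.
        weighted = sum(weights[m] * per_modality[m][i] for m in per_modality)
        fused.append(weighted / total)
    return fused


# Hypothetical outputs of the three modality-specific models for two molecules.
preds = {
    "smiles_transformer": [1.0, 2.0],
    "ecfp_bigru": [2.0, 2.0],
    "graph_gcn": [3.0, 5.0],
}
# Hypothetical weights; in practice they would be tuned or learned.
weights = {"smiles_transformer": 1.0, "ecfp_bigru": 1.0, "graph_gcn": 2.0}

fused = fuse_predictions(preds, weights)
print(fused)
```

With fixed weights, a noisy prediction from one modality is diluted by the other two, which is one intuition behind the noise resilience claimed for multimodal models. Fusing learned features before the final prediction layer would be an alternative design.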

Submission History

From: Liangxu Xie
(v1) Fri, 29 Dec 2023 07:19:42 UTC (1,666 KB)
(v2) Fri, 13 Sep 2024 01:05:46 UTC (9,908 KB)