TY - JOUR
T1 - Structural Analysis and Prediction of Hematotoxicity Using Deep Learning Approaches
AU - Long, Teng-Zhi
AU - Shi, Shao-Hua
AU - Liu, Shao
AU - Lu, Ai-Ping
AU - Liu, Zhao-Qian
AU - Li, Min
AU - Hou, Ting-Jun
AU - Cao, Dong-Sheng
N1 - Funding Information:
This work was supported by the National Key Research and Development Program of China (2021YFF1201400), the National Natural Science Foundation of China (22173118, 22220102001), the Hunan Provincial Science Fund for Distinguished Young Scholars (2021JJ10068), the Science and Technology Innovation Program of Hunan Province (2021RC4011), the Natural Science Foundation of Hunan Province (2022JJ80104), the Changsha Science and Technology Bureau Project (kq2001034), and the 2020 Guangdong Provincial Science and Technology Innovation Strategy Special Fund (2020B1212030006, Guangdong-Hong Kong-Macau Joint Lab). The authors acknowledge Haikun Xu and the High-Performance Computing Center of Central South University for support. The study was approved by the university’s review board.
Publisher Copyright:
© 2022 American Chemical Society.
PY - 2023/1/9
Y1 - 2023/1/9
N2 - Hematotoxicity has become a serious but overlooked toxicity in drug discovery. However, only a few in silico models have been reported for the prediction of hematotoxicity. In this study, we constructed a high-quality dataset comprising 759 hematotoxic compounds and 1623 nonhematotoxic compounds and then established a series of classification models based on combinations of seven machine learning (ML) algorithms and nine molecular representations. The results based on two data partitioning strategies and applicability domain (AD) analysis show that the best prediction model, based on Attentive FP, yielded a balanced accuracy (BA) of 72.6% and an area under the receiver operating characteristic curve (AUC) of 76.8% for the validation set, and a BA of 69.2% and an AUC of 75.9% for the test set. In addition, compared with existing filtering rules and models, our model achieved the highest BA value of 67.5% for the external validation set. Additionally, the Shapley additive explanations (SHAP) and atom heatmap approaches were used to identify the important features and structural fragments related to hematotoxicity, which could offer useful guidance for detecting undesired positive substances. Furthermore, matched molecular pair analysis (MMPA) and a representative substructure derivation technique were employed to characterize and investigate the transformation principles and distinctive structural features of hematotoxic chemicals. We believe that the novel graph-based deep learning algorithms and insightful interpretation presented in this study can be used as a trustworthy and effective tool to assess hematotoxicity in the development of new drugs.
AB - Hematotoxicity has become a serious but overlooked toxicity in drug discovery. However, only a few in silico models have been reported for the prediction of hematotoxicity. In this study, we constructed a high-quality dataset comprising 759 hematotoxic compounds and 1623 nonhematotoxic compounds and then established a series of classification models based on combinations of seven machine learning (ML) algorithms and nine molecular representations. The results based on two data partitioning strategies and applicability domain (AD) analysis show that the best prediction model, based on Attentive FP, yielded a balanced accuracy (BA) of 72.6% and an area under the receiver operating characteristic curve (AUC) of 76.8% for the validation set, and a BA of 69.2% and an AUC of 75.9% for the test set. In addition, compared with existing filtering rules and models, our model achieved the highest BA value of 67.5% for the external validation set. Additionally, the Shapley additive explanations (SHAP) and atom heatmap approaches were used to identify the important features and structural fragments related to hematotoxicity, which could offer useful guidance for detecting undesired positive substances. Furthermore, matched molecular pair analysis (MMPA) and a representative substructure derivation technique were employed to characterize and investigate the transformation principles and distinctive structural features of hematotoxic chemicals. We believe that the novel graph-based deep learning algorithms and insightful interpretation presented in this study can be used as a trustworthy and effective tool to assess hematotoxicity in the development of new drugs.
UR - http://www.scopus.com/inward/record.url?scp=85143872975&partnerID=8YFLogxK
U2 - 10.1021/acs.jcim.2c01088
DO - 10.1021/acs.jcim.2c01088
M3 - Journal article
C2 - 36472475
AN - SCOPUS:85143872975
SN - 1549-9596
VL - 63
SP - 111
EP - 125
JO - Journal of Chemical Information and Modeling
JF - Journal of Chemical Information and Modeling
IS - 1
ER -