TY - JOUR
T1 - ChemSAR
T2 - An online pipelining platform for molecular SAR modeling
AU - Dong, Jie
AU - Yao, Zhi Jiang
AU - Zhu, Min Feng
AU - Wang, Ning Ning
AU - Lu, Ben
AU - Chen, Alex F.
AU - Lyu, Aiping
AU - Miao, Hongyu
AU - Zeng, Wen Bin
AU - Cao, Dong Sheng
N1 - Publisher Copyright: © 2017 The Author(s).
PY - 2017/5/4
Y1 - 2017/5/4
N2 - Background: In recent years, predictive models based on machine learning techniques have proven to be feasible and effective in drug discovery. However, to develop such a model, researchers usually have to combine multiple tools and work through several different steps (e.g., the RDKit or ChemoPy package for molecular descriptor calculation, ChemAxon Standardizer for structure preprocessing, the scikit-learn package for model building, and the ggplot2 package for statistical analysis and visualization). In addition, these tasks may require strong programming skills, which poses severe challenges for users without advanced training in computer programming. Therefore, an online pipelining platform that integrates a number of selected tools is a valuable and efficient solution that can meet the needs of related researchers. Results: This work presents a web-based pipelining platform, called ChemSAR, for generating SAR classification models of small molecules. The capabilities of ChemSAR include the validation and standardization of chemical structure representation, the computation of 783 1D/2D molecular descriptors and ten types of widely used fingerprints for small molecules, filtering methods for feature selection, the generation of predictive models via a step-by-step job submission process, model interpretation in terms of feature importance and tree visualization, as well as a helpful report generation system. The results can be visualized as high-quality plots and downloaded as local files. Conclusion: ChemSAR provides an integrated web-based platform for generating SAR classification models that will benefit cheminformatics and other biomedical users. It is freely available at: http://chemsar.scbdd.com.
KW - Cheminformatics
KW - Machine learning
KW - Molecular descriptors
KW - Online modeling
KW - QSAR/SAR
UR - http://www.scopus.com/inward/record.url?scp=85018742164&partnerID=8YFLogxK
U2 - 10.1186/s13321-017-0215-1
DO - 10.1186/s13321-017-0215-1
M3 - Journal article
AN - SCOPUS:85018742164
SN - 1758-2946
VL - 9
JO - Journal of Cheminformatics
JF - Journal of Cheminformatics
IS - 1
M1 - 27
ER -