ChemSAR: An online pipelining platform for molecular SAR modeling

Jie Dong, Zhi Jiang Yao, Min Feng Zhu, Ning Ning Wang, Ben Lu, Alex F. Chen, Aiping LYU, Hongyu Miao, Wen Bin Zeng, Dong Sheng Cao*

*Corresponding author for this work

Research output: Contribution to journal, Journal article, peer-review

50 Citations (Scopus)


Background: In recent years, predictive models based on machine learning techniques have proven feasible and effective in drug discovery. However, to develop such a model, researchers usually have to combine multiple tools and work through several separate steps (e.g., the RDKit or ChemoPy package for molecular descriptor calculation, ChemAxon Standardizer for structure preprocessing, the scikit-learn package for model building, and the ggplot2 package for statistical analysis and visualization). These tasks may also require strong programming skills, which poses a severe challenge for users without advanced training in computer programming. An online pipelining platform that integrates a number of selected tools is therefore a valuable and efficient solution for these researchers.

Results: This work presents a web-based pipelining platform, called ChemSAR, for generating SAR classification models of small molecules. Its capabilities include validation and standardization of chemical structure representations, computation of 783 1D/2D molecular descriptors and ten types of widely used fingerprints for small molecules, filtering methods for feature selection, generation of predictive models via a step-by-step job submission process, model interpretation in terms of feature importance and tree visualization, and a report generation system. The results can be visualized as high-quality plots and downloaded as local files.

Conclusion: ChemSAR provides an integrated web-based platform for generating SAR classification models that will benefit cheminformatics and other biomedical users. It is freely available online.
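The abstract lists filtering methods for feature selection among the platform's capabilities but does not say which filters are used. As a rough illustration only, the sketch below assumes two common filters: dropping near-constant descriptor columns, then dropping any column that is highly correlated with one already kept. The function names, thresholds, and descriptor values are hypothetical and are not taken from ChemSAR.

```python
from math import sqrt
from statistics import mean, pvariance


def pearson(xs, ys):
    """Pearson correlation of two equal-length columns (0.0 if either is constant)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def filter_features(table, min_variance=1e-8, max_abs_corr=0.95):
    """Return the names of descriptor columns that survive two simple filters.

    `table` maps a descriptor name to its column of values (one per molecule).
    Both thresholds are illustrative assumptions, not ChemSAR defaults.
    """
    # Filter 1: drop near-constant columns, which cannot separate classes.
    kept = [name for name, col in table.items() if pvariance(col) > min_variance]

    # Filter 2: walk the survivors in order and drop any column that is
    # highly correlated with a column that has already been selected.
    selected = []
    for name in kept:
        if all(abs(pearson(table[name], table[s])) <= max_abs_corr for s in selected):
            selected.append(name)
    return selected


# Hypothetical descriptor values for five molecules.
descriptors = {
    "mol_weight": [180.2, 151.2, 206.3, 194.2, 138.1],
    "heavy_atoms": [13.0, 11.0, 15.0, 14.0, 10.0],  # tracks mol_weight closely
    "n_rings": [1.0, 1.0, 1.0, 1.0, 1.0],           # constant column
    "logp": [1.2, 0.5, 3.5, -0.1, 2.3],
}
selected = filter_features(descriptors)
print(selected)
```

With these made-up values, the constant `n_rings` column fails the variance filter and `heavy_atoms` is dropped as redundant with `mol_weight`. In a real workflow the surviving columns would then be passed to the model-building step.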

Original language: English
Article number: 27
Journal: Journal of Cheminformatics
Issue number: 1
Publication status: Published - 4 May 2017

Scopus Subject Areas

  • Computer Science Applications
  • Physical and Theoretical Chemistry
  • Computer Graphics and Computer-Aided Design
  • Library and Information Sciences

User-Defined Keywords

  • Cheminformatics
  • Machine learning
  • Molecular descriptors
  • Online modeling

