Background: High-throughput screening (HTS) and virtual screening (VS) have been widely used to identify potential hits from large chemical libraries. However, the frequent occurrence of 'noisy compounds' in the screened libraries, such as compounds with poor drug-likeness, poor selectivity or potential toxicity, has greatly weakened the enrichment capability of HTS and VS campaigns. Therefore, the development of comprehensive and credible tools to detect noisy compounds from chemical libraries is urgently needed in early stages of drug discovery.
Results: In this study, we developed a freely available integrated python library for negative design, called Scopy, which supports the functions of data preparation, calculation of descriptors, scaffolds and screening filters, and data visualization. The current version of Scopy can calculate 39 basic molecular properties, 3 comprehensive molecular evaluation scores, 2 types of molecular scaffolds, 6 types of substructure descriptors and 2 types of fingerprints. A number of important screening rules are also provided by Scopy, including 15 drug-likeness rules (13 drug-likeness rules and 2 building block rules), 8 frequent hitter rules (four assay interference substructure filters and four promiscuous compound substructure filters), and 11 toxicophore filters (five human-related toxicity substructure filters, three environment-related toxicity substructure filters and three comprehensive toxicity substructure filters). Moreover, this library supports four different visualization functions to help users to gain a better understanding of the screened data, including basic feature radar chart, feature-feature-related scatter diagram, functional group marker gram and cloud gram.
Conclusion: Scopy provides a comprehensive Python package to filter out compounds with undesirable properties or substructures, which will benefit the design of high-quality chemical libraries for drug design and discovery. It is freely available at https://github.com/kotori-y/Scopy.
Scopus Subject Areas
- Information Systems
- Molecular Biology
- frequent hitters
- negative design