rweChemScreener: high-dimension mediation analysis detects potential effective chemical ingredients of traditional Chinese medicine from real-world clinical data

  • Qiguang Zheng
  • Yuhang Yan
  • Ling Xu
  • Jingyi Lin
  • Zixin Shu
  • Dengying Yan
  • Xiaodong Li
  • Baoyan Liu
  • Kam Wa Chan*
  • Guanwei Fan*
  • Xuezhong Zhou*
  • *Corresponding author for this work

Research output: Contribution to journal › Journal article › peer-review

2 Citations (Scopus)

Abstract

Background and objectives: Traditional Chinese medicine (TCM) is a classical personalized medicine paradigm with a long history of managing various conditions (e.g., COVID-19, heart failure (HF)). The large pool of herbal prescription data accumulated in electronic medical records (EMRs) provides an opportunity to develop novel regimens through data mining. However, new drug discovery from TCM is complicated by the complexity of the ingredients contained in TCM prescriptions. A causal inference framework that integrates clinical reasoning with the underlying biochemistry of herbal prescriptions is essential for identifying the core effective ingredients from raw herbs.

Materials and methods: We proposed a novel causal learning approach, the real-world evidence-based effective Chemical Screener (rweChemScreener), to identify the effective ingredients in TCM treatments using real-world data from EMRs. Two real-world registries of inpatients with COVID-19 and HF treated with TCM were used as datasets. rweChemScreener uses high-dimensional mediation analysis to identify and validate herbal ingredients that mediate the therapeutic effects of TCM treatments. These mediating ingredients were derived from a mapping process that integrated prescription records with a herb-ingredient knowledge graph. In addition, a proxy therapeutic mediator variable was introduced by mapping the ingredients via an eXtreme Gradient Boosting (XGBoost) regressor. The contribution of each ingredient to the mediation effect was estimated using SHapley Additive exPlanations (SHAP), and the ingredients with top-ranking importance were considered the potential effective ingredients of TCM treatments.

Results: Multiple experimental assessments conducted on semi-synthetic datasets demonstrated that rweChemScreener estimated the mediating effect more accurately than the baseline model. We identified the top six potential effective herbal ingredients for COVID-19 (e.g., apicidin, limonin, and tricin) and the top nine for HF (e.g., rutin, beta-sitosterol, and salicylic acid). The effects of these ingredients were subsequently supported by the literature.

Conclusion: This study suggests that rweChemScreener, a novel causal machine learning approach, is capable of screening effective chemical ingredients of TCM herbal treatments through high-dimensional mediation analysis. The key contribution of this study is that it screens for potential effective ingredients with real-world clinical efficacy as the direct target, by integrating real-world clinical data with basic research data.
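The two ideas the abstract combines, mapping prescriptions to ingredient exposures and then decomposing a treatment effect through a mediator, can be illustrated with a small sketch. This is not the rweChemScreener implementation: it swaps in a classical single-mediator, product-of-coefficients estimate based on ordinary least squares in place of the high-dimensional XGBoost and SHAP pipeline described above. The herb names, the `HERB_INGREDIENTS` map, the helper functions, and all numbers are hypothetical and chosen only for illustration.

```python
from statistics import mean

# Hypothetical herb -> ingredient map standing in for a herb-ingredient
# knowledge graph (herb names and memberships are made up).
HERB_INGREDIENTS = {
    "herb_a": {"rutin", "beta-sitosterol"},
    "herb_b": {"limonin"},
    "herb_c": {"rutin", "tricin"},
}


def ingredient_vector(prescription, ingredients):
    """Return 0/1 exposures: 1 if any prescribed herb contains the ingredient."""
    present = set()
    for herb in prescription:
        present |= HERB_INGREDIENTS.get(herb, set())
    return [1 if name in present else 0 for name in ingredients]


def slope(x, y):
    """Least-squares slope of y on a single predictor x (with intercept)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def residuals(x, y):
    """Residuals of y after regressing it on x (with intercept)."""
    mx, my, b = mean(x), mean(y), slope(x, y)
    return [yi - my - b * (xi - mx) for xi, yi in zip(x, y)]


def mediation(t, m, y):
    """Single-mediator decomposition: total = direct + a * b."""
    a = slope(t, m)                               # treatment -> mediator
    b = slope(residuals(t, m), residuals(t, y))   # mediator -> outcome given t
    direct = slope(residuals(m, t), residuals(m, y))  # treatment -> outcome given m
    return {"total": slope(t, y), "direct": direct, "indirect": a * b}


# Step 1: map one hypothetical prescription onto ingredient exposures.
ingredients = sorted(set().union(*HERB_INGREDIENTS.values()))
exposure = ingredient_vector(["herb_a", "herb_c"], ingredients)

# Step 2: toy data built so that m = 2*t + noise and y = 1*t + 3*m exactly.
t = [0, 0, 0, 0, 1, 1, 1, 1]           # treated with the regimen or not
m = [-1, 1, -1, 1, 1, 3, 1, 3]         # hypothetical proxy mediator score
y = [ti + 3 * mi for ti, mi in zip(t, m)]
effects = mediation(t, m, y)
print(ingredients, exposure)
print(effects)
```

In this toy setup the indirect effect is the part of the treatment effect that flows through the mediator. The high-dimensional setting in the abstract replaces the single mediator with many ingredient exposures, which is why a learned proxy mediator and per-ingredient attribution are needed there.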

Original language: English
Article number: 157225
Number of pages: 13
Journal: Phytomedicine
Volume: 147
Early online date: 12 Sept 2025
Publication status: Published - Nov 2025

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 3 - Good Health and Well-being

User-Defined Keywords

  • COVID-19
  • Effective herb ingredient screening
  • Heart failure
  • High-dimension mediation analysis
  • Real-world clinical data
  • Traditional Chinese medicine
