TY - JOUR
T1 - DrugReAlign: a multisource prompt framework for drug repurposing based on large language models
AU - Wei, Jinhang
AU - Zhuo, Linlin
AU - Fu, Xiangzheng
AU - Zeng, Xiangxiang
AU - Wang, Li
AU - Zou, Quan
AU - Cao, Dongsheng
N1 - The work was supported by the National Natural Science Foundation of China (Nos. 62450002, 62302339, 62372158). Publisher Copyright: © The Author(s) 2024.
PY - 2024/10/8
Y1 - 2024/10/8
N2 - Drug repurposing is a promising approach in the field of drug discovery owing to its efficiency and cost-effectiveness. Most current drug repurposing models rely on specific datasets for training, which limits their predictive accuracy and scope. The number of both market-approved and experimental drugs is vast, forming an extensive molecular space. Due to limitations in parameter size and data volume, traditional drug-target interaction (DTI) prediction models struggle to generalize well within such a broad space. In contrast, large language models (LLMs), with their vast parameter sizes and extensive training data, demonstrate certain advantages in drug repurposing tasks. In our research, we introduce a novel drug repurposing framework, DrugReAlign, based on LLMs and multi-source prompt techniques, designed to fully exploit the potential of existing drugs efficiently. Leveraging LLMs, the DrugReAlign framework acquires general knowledge about targets and drugs from extensive human knowledge bases, overcoming the data availability limitations of traditional approaches. Furthermore, we collected target summaries and target-drug space interaction data from databases as multi-source prompts, substantially improving LLM performance in drug repurposing. We validated the efficiency and reliability of the proposed framework through molecular docking and DTI datasets. Significantly, our findings suggest a direct correlation between the accuracy of LLMs' target analysis and the quality of prediction outcomes. These findings signify that the proposed framework holds the promise of inaugurating a new paradigm in drug repurposing.
AB - Drug repurposing is a promising approach in the field of drug discovery owing to its efficiency and cost-effectiveness. Most current drug repurposing models rely on specific datasets for training, which limits their predictive accuracy and scope. The number of both market-approved and experimental drugs is vast, forming an extensive molecular space. Due to limitations in parameter size and data volume, traditional drug-target interaction (DTI) prediction models struggle to generalize well within such a broad space. In contrast, large language models (LLMs), with their vast parameter sizes and extensive training data, demonstrate certain advantages in drug repurposing tasks. In our research, we introduce a novel drug repurposing framework, DrugReAlign, based on LLMs and multi-source prompt techniques, designed to fully exploit the potential of existing drugs efficiently. Leveraging LLMs, the DrugReAlign framework acquires general knowledge about targets and drugs from extensive human knowledge bases, overcoming the data availability limitations of traditional approaches. Furthermore, we collected target summaries and target-drug space interaction data from databases as multi-source prompts, substantially improving LLM performance in drug repurposing. We validated the efficiency and reliability of the proposed framework through molecular docking and DTI datasets. Significantly, our findings suggest a direct correlation between the accuracy of LLMs' target analysis and the quality of prediction outcomes. These findings signify that the proposed framework holds the promise of inaugurating a new paradigm in drug repurposing.
KW - Drug repositioning
KW - Drug-target interactions
KW - Large Language Model
KW - Molecular docking
UR - http://www.scopus.com/inward/record.url?scp=85205980233&partnerID=8YFLogxK
UR - https://bmcbiol.biomedcentral.com/articles/10.1186/s12915-024-02028-3
U2 - 10.1186/s12915-024-02028-3
DO - 10.1186/s12915-024-02028-3
M3 - Journal article
C2 - 39379930
AN - SCOPUS:85205980233
SN - 1741-7007
VL - 22
JO - BMC Biology
JF - BMC Biology
IS - 1
M1 - 226
ER -