TY - JOUR
T1 - PKAN
T2 - Leveraging Kolmogorov-Arnold Networks and Multi-modal Learning for Peptide Prediction with Advanced Language Models
AU - Wang, Li
AU - Fu, Xiangzheng
AU - Ye, Xiucai
AU - Sakurai, Tetsuya
AU - Zeng, Xiangxiang
AU - Liu, Yiping
N1 - Funding Information:
This work was supported by the National Science and Technology Major Project (2023ZD0120902), the National Natural Science Foundation of China (U22A2037; 62425204; 62122025; 62450002; 62432011; 62472152; 62106073), Hunan Provincial Natural Science Foundation of China (Grant No. 2024JJ4015), the Beijing Natural Science Foundation (L248013), JSPS KAKENHI Grant Numbers JP23H03411 and JP22K12144, and the JST Grant Number JPMJPF2017.
Publisher Copyright:
© 2025 IEEE.
PY - 2025/4/17
Y1 - 2025/4/17
N2 - Peptides can offer highly specific biological activities, serving as essential mediators of intercellular signaling, which are critical for advancing precision medicine and drug development. Their primary structure can be depicted either as an amino acid sequence or as a chemical molecule consisting of atoms and chemical bonds. Large language models (LLMs) hold the potential to thoroughly elucidate the intricate intrinsic properties of peptides. Here we present the Peptide Kolmogorov-Arnold Network (PKAN), a framework leveraging multi-modal representations inspired by advanced language models for peptide activity and functionality prediction. Comparative experiments across tasks show that PKAN outperforms state-of-the-art models while maintaining a streamlined design with superior predictive capabilities. The multi-modal feature importance scoring, anchored in global structures and the significant marginal impacts of derived features on the model, coupled with intricate symbolic regression of specific activation functions, further demonstrates the robustness and precision of the PKAN framework in identifying and elucidating key determinants of peptide functionality. This work provides scientific evidence for investigating the complex mechanisms of peptide materials and supports the progression of peptide language paradigms in biology.
AB - Peptides can offer highly specific biological activities, serving as essential mediators of intercellular signaling, which are critical for advancing precision medicine and drug development. Their primary structure can be depicted either as an amino acid sequence or as a chemical molecule consisting of atoms and chemical bonds. Large language models (LLMs) hold the potential to thoroughly elucidate the intricate intrinsic properties of peptides. Here we present the Peptide Kolmogorov-Arnold Network (PKAN), a framework leveraging multi-modal representations inspired by advanced language models for peptide activity and functionality prediction. Comparative experiments across tasks show that PKAN outperforms state-of-the-art models while maintaining a streamlined design with superior predictive capabilities. The multi-modal feature importance scoring, anchored in global structures and the significant marginal impacts of derived features on the model, coupled with intricate symbolic regression of specific activation functions, further demonstrates the robustness and precision of the PKAN framework in identifying and elucidating key determinants of peptide functionality. This work provides scientific evidence for investigating the complex mechanisms of peptide materials and supports the progression of peptide language paradigms in biology.
KW - Antimicrobial peptide activity
KW - Interpretability
KW - Kolmogorov-Arnold network
KW - Multi-modality language learning
KW - Peptide prediction
UR - https://www.scopus.com/pages/publications/105003180319
U2 - 10.1109/JBHI.2025.3561846
DO - 10.1109/JBHI.2025.3561846
M3 - Journal article
AN - SCOPUS:105003180319
SN - 2168-2194
VL - 29
SP - 1
EP - 10
JO - IEEE Journal of Biomedical and Health Informatics
JF - IEEE Journal of Biomedical and Health Informatics
IS - 10
ER -