Abstract
Protein-RNA complexes, particularly those involving RNA-binding proteins and long non-coding RNAs (lncRNAs), commonly influence gene expression and mediate fundamental cellular processes. Despite significant advances in representations for these biological sequences, k-mer-based sequence decomposition generally yields fixed-length substrings and therefore fails to capture information from variable-length biological functional regions. In this paper, we develop a concept of expressiveness for k-mer decompositions as a theoretical underpinning for traversing all k-mer decompositions. Based on this concept, we propose BERTDGA-LPI, an approach for lncRNA-protein interaction (LPI) prediction that uses dynamic graph attention to capture information from variable-length biological functional regions and pretrained language models to capture the influence of RNA and protein context. Experimental results show that BERTDGA-LPI outperforms state-of-the-art methods across two Homo sapiens datasets, one plant species dataset, and two species-unspecific datasets. Furthermore, BERTDGA-LPI is validated as effective in predicting unknown RNA-protein interactions (RPI), with 100% prediction accuracy on six independent validation sets from different species. This study lays a theoretical underpinning for traversing all k-mer decompositions and offers a broadly applicable and efficient tool for LPI and RPI prediction based only on sequences.
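To illustrate the limitation the abstract describes, the following minimal Python sketch shows a plain overlapping k-mer decomposition: for any single k, every substring has the same length. The function names `kmers` and `multi_k_decomposition` and the example sequence are hypothetical and chosen only for illustration; this is not the BERTDGA-LPI method, which is not specified on this page.

```python
def kmers(seq: str, k: int) -> list[str]:
    """Return all overlapping substrings of length k (stride 1)."""
    if k <= 0:
        raise ValueError("k must be positive")
    # A sequence shorter than k yields an empty list.
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def multi_k_decomposition(seq: str, k_values) -> dict[int, list[str]]:
    """Collect the k-mer decomposition for each k in k_values.

    Illustrative only: enumerating several k values is one naive way to
    expose substrings of different lengths; it is an assumption here,
    not the approach taken in the paper.
    """
    return {k: kmers(seq, k) for k in k_values}


if __name__ == "__main__":
    rna = "AUGGCUA"  # toy sequence, not taken from any dataset
    for k, parts in multi_k_decomposition(rna, [3, 4]).items():
        print(k, parts)
```

With a single fixed k, a functional region longer or shorter than k is always split or padded by neighbouring bases, which is the motivation for considering decompositions across many k values.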
| Original language | English |
|---|---|
| Pages (from-to) | 3175-3187 |
| Number of pages | 13 |
| Journal | IEEE Transactions on Computational Biology and Bioinformatics |
| Volume | 22 |
| Issue number | 6 |
| Early online date | 25 Sept 2025 |
| Publication status | Published - Nov 2025 |
User-Defined Keywords
- Graph Neural Network
- K-Mer
- LncRNA-Protein Interaction
- Long Non-Coding RNA
- Pretrained Language Model
Fingerprint
Dive into the research topics of 'Dynamic Graph Attention Meets Pretrained Language Models: Adaptive K-Mer Decomposition for LncRNA-Protein Interaction Prediction'. Together they form a unique fingerprint.