Abstract
Multi-domain biological network association and clustering have attracted considerable attention in biological data integration because they can provide a more global and accurate understanding of biological phenomena. In many problems, different domains have different cluster structures. Because data are collected from a rapidly growing number of sources, some domains may be strongly associated with the other domains while others are only weakly associated. A key challenge is to determine the degree of association among domains and to achieve accurate clustering results through data integration. In this paper, we propose an unsupervised learning approach for multi-domain network association based on block signed graph clustering. In particular, by calculating consistency weights, the proposed algorithm automatically identifies domains that are strongly (or weakly) related to each other by assigning them larger (or smaller) weights. This approach not only significantly improves clustering accuracy but also helps to explain the associations among multi-domain networks. In each iteration of the proposed algorithm, we update the consistency weights based on the cluster structure of each domain, and then use different sets of eigenvectors to obtain the cluster structure of each domain. Experimental results on both synthetic and real data sets (including neuron activity data and gene expression data) empirically demonstrate the effectiveness of the proposed algorithm in both clustering performance and domain association capability.
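The signed graph spectral clustering idea named in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the block construction across domains and the consistency weight updates are not reproduced here. The sketch only shows the generic building block, assuming the standard signed Laplacian L = D - W with D_ii = sum_j |W_ij|, on a single toy graph. The function names `signed_laplacian` and `signed_spectral_bipartition` and the toy matrix `W` are hypothetical and chosen for illustration.

```python
import numpy as np


def signed_laplacian(W):
    """Signed Laplacian L = D - W, where D_ii = sum_j |W_ij| (assumed definition)."""
    W = np.asarray(W, dtype=float)
    degrees = np.abs(W).sum(axis=1)
    return np.diag(degrees) - W


def signed_spectral_bipartition(W):
    """Split nodes into two clusters by the sign pattern of the eigenvector
    belonging to the smallest eigenvalue of the signed Laplacian."""
    L = signed_laplacian(W)
    # eigh returns eigenvalues in ascending order for symmetric matrices
    eigenvalues, eigenvectors = np.linalg.eigh(L)
    v = eigenvectors[:, 0]
    return (v > 0).astype(int)


# Toy signed graph (illustrative only): two groups of three nodes,
# positive edges within a group, negative edges between groups.
signs = np.array([1, 1, 1, -1, -1, -1])
W = np.outer(signs, signs).astype(float)
np.fill_diagonal(W, 0.0)

labels = signed_spectral_bipartition(W)
print(labels)
```

On this toy graph the first three nodes receive one label and the last three the other. Which group is labeled 1 depends on the arbitrary sign of the eigenvector returned by the solver, so only the grouping is meaningful. Recovering more than two clusters would typically use several eigenvectors followed by a clustering step such as k-means, which is omitted here.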
Original language | English |
---|---|
Article number | 8395077 |
Pages (from-to) | 435-448 |
Number of pages | 14 |
Journal | IEEE/ACM Transactions on Computational Biology and Bioinformatics |
Volume | 17 |
Issue number | 2 |
DOIs | |
Publication status | Published - 1 Mar 2020 |
Scopus Subject Areas
- Biotechnology
- Genetics
- Applied Mathematics
User-Defined Keywords
- biological data
- Multi-domain association
- signed graph
- spectral clustering
- unsupervised learning