TY - JOUR
T1 - Multi-Domain Networks Association for Biological Data Using Block Signed Graph Clustering
AU - Liu, Ye
AU - Ng, Kwok Po
AU - Wu, Stephen
N1 - Funding Information:
The research of this author was supported in part by the HKRGC GRF 1202715, 12306616, and 12200317 and HKBU RC-ICRS/16-17/03.
PY - 2020/3/1
Y1 - 2020/3/1
N2 - Multi-domain biological network association and clustering have attracted a lot of attention in biological data integration, as they can provide a more global and accurate understanding of biological phenomena. In many problems, different domains may have different cluster structures. Due to the rapid growth of data collection from different sources, some domains may be strongly or weakly associated with the other domains. A key challenge is to determine the degree of association among different domains and to achieve accurate clustering results through data integration. In this paper, we propose an unsupervised learning approach for multi-domain network association using block signed graph clustering. In particular, by calculating consistency weights, the proposed algorithm automatically identifies domains that are strongly (or weakly) relevant to each other by assigning them larger (or smaller) weights. This approach not only significantly improves clustering accuracy but also helps in understanding multi-domain network association. In each iteration of the proposed algorithm, we update the consistency weights based on the cluster structure of each domain, and then use different sets of eigenvectors to obtain different cluster structures in each domain. Experimental results on both synthetic and real data sets (including neuron activity data and gene expression data) empirically demonstrate the effectiveness of the proposed algorithm in clustering performance and in domain association capability.
AB - Multi-domain biological network association and clustering have attracted a lot of attention in biological data integration, as they can provide a more global and accurate understanding of biological phenomena. In many problems, different domains may have different cluster structures. Due to the rapid growth of data collection from different sources, some domains may be strongly or weakly associated with the other domains. A key challenge is to determine the degree of association among different domains and to achieve accurate clustering results through data integration. In this paper, we propose an unsupervised learning approach for multi-domain network association using block signed graph clustering. In particular, by calculating consistency weights, the proposed algorithm automatically identifies domains that are strongly (or weakly) relevant to each other by assigning them larger (or smaller) weights. This approach not only significantly improves clustering accuracy but also helps in understanding multi-domain network association. In each iteration of the proposed algorithm, we update the consistency weights based on the cluster structure of each domain, and then use different sets of eigenvectors to obtain different cluster structures in each domain. Experimental results on both synthetic and real data sets (including neuron activity data and gene expression data) empirically demonstrate the effectiveness of the proposed algorithm in clustering performance and in domain association capability.
KW - biological data
KW - Multi-domain association
KW - signed graph
KW - spectral clustering
KW - unsupervised learning
UR - http://www.scopus.com/inward/record.url?scp=85049092188&partnerID=8YFLogxK
U2 - 10.1109/TCBB.2018.2848904
DO - 10.1109/TCBB.2018.2848904
M3 - Journal article
C2 - 29994480
AN - SCOPUS:85049092188
SN - 1545-5963
VL - 17
SP - 435
EP - 448
JO - IEEE/ACM Transactions on Computational Biology and Bioinformatics
JF - IEEE/ACM Transactions on Computational Biology and Bioinformatics
IS - 2
M1 - 8395077
ER -