TY - JOUR
T1 - A new clustering algorithm for genes with multiple cancer diseases by self-consistent field iteration method
AU - Liu, Ye
AU - Ng, Michael K.
N1 - M. Ng’s work was supported by HKRGC GRF 12300218, 12300519, 17201020 and 17300021.
Publisher Copyright:
© 2022, The Author(s), under exclusive licence to Springer-Verlag GmbH Austria, part of Springer Nature.
PY - 2022/12
Y1 - 2022/12
N2 - An increasing body of literature shows that predicting gene clusters related to human cancer diseases using biological networks is significant in bioinformatics: it helps to understand disease mechanisms and benefits the development of diagnostics and therapeutics. However, due to noise and preprocessing of data, a single network or graph generated from one cancer disease is insufficient to cluster genes. As some cancer diseases are correlated with each other in practice, a more accurate and robust partition of genes can be obtained by integrating several gene expression networks generated from those associated cancer diseases. In this paper, we propose a multiple graph spectral clustering method with graph association. It helps us to discover functional modules in each cancer disease more accurately and comprehensively, while also determining the degree of association among cancer diseases. Our idea is to construct a block adjacency matrix that integrates the adjacency matrix of each graph with the degree of association among multiple graphs; spectral clustering is then employed to calculate clusters for each graph. The proposed algorithm is based on a self-consistent field iteration, so that both the degree of association and the gene clusters can be identified during the iterations. Moreover, we establish, under some assumptions, a condition under which convergence of the proposed algorithm is guaranteed. Experimental results on two datasets of human cancer diseases are presented, which demonstrate that the proposed method can not only identify gene functional modules but also accurately calculate the degree of association among different cancer diseases.
AB - An increasing body of literature shows that predicting gene clusters related to human cancer diseases using biological networks is significant in bioinformatics: it helps to understand disease mechanisms and benefits the development of diagnostics and therapeutics. However, due to noise and preprocessing of data, a single network or graph generated from one cancer disease is insufficient to cluster genes. As some cancer diseases are correlated with each other in practice, a more accurate and robust partition of genes can be obtained by integrating several gene expression networks generated from those associated cancer diseases. In this paper, we propose a multiple graph spectral clustering method with graph association. It helps us to discover functional modules in each cancer disease more accurately and comprehensively, while also determining the degree of association among cancer diseases. Our idea is to construct a block adjacency matrix that integrates the adjacency matrix of each graph with the degree of association among multiple graphs; spectral clustering is then employed to calculate clusters for each graph. The proposed algorithm is based on a self-consistent field iteration, so that both the degree of association and the gene clusters can be identified during the iterations. Moreover, we establish, under some assumptions, a condition under which convergence of the proposed algorithm is guaranteed. Experimental results on two datasets of human cancer diseases are presented, which demonstrate that the proposed method can not only identify gene functional modules but also accurately calculate the degree of association among different cancer diseases.
KW - Gene clusters
KW - Gene coexpression data
KW - Nonlinear eigenvalue problem
KW - Self-consistent field iteration
KW - Spectral clustering
UR - http://www.scopus.com/inward/record.url?scp=85127950302&partnerID=8YFLogxK
U2 - 10.1007/s13721-022-00362-6
DO - 10.1007/s13721-022-00362-6
M3 - Journal article
AN - SCOPUS:85127950302
SN - 2192-6662
VL - 11
JO - Network Modeling Analysis in Health Informatics and Bioinformatics
JF - Network Modeling Analysis in Health Informatics and Bioinformatics
IS - 1
M1 - 20
ER -