TY - JOUR
T1 - On the spectral characterization and scalable mining of network communities
AU - Bo, Yang
AU - Liu, Jiming
AU - Feng, Jianfeng
N1 - Funding Information:
The authors would like to express their thanks to the anonymous reviewers for their constructive comments and suggestions, and to Anton Bovier for the helpful discussion with him. This work was supported in part by the National Natural Science Foundation of China Grants (60873149, 60973088, 61133011, and 61170092) and the National High-Tech Research and Development Plan of China (2006AA10Z245 and 2006AA10A309), in part by the Open Project Program of the National Laboratory of Pattern Recognition (NLPR), the Basic Scientific Research Fund of Chinese Ministry of Education (200903177), and the Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, and in part by a Hong Kong Baptist University FRG grant (06-07-II-66). Jianfeng Feng is also supported by the Future and Emerging Technologies (FET) programme within the Seventh Frame-work Programme for Research of the European Commission, under the FET-OPEN grant agreement BION (213219).
PY - 2012/2
Y1 - 2012/2
N2 - Network communities refer to groups of vertices within which their connecting links are dense but between which they are sparse. A network community mining problem (or NCMP for short) is concerned with the problem of finding all such communities from a given network. A wide variety of applications can be formulated as NCMPs, ranging from social and/or biological network analysis to web mining and searching. So far, many algorithms addressing NCMPs have been developed and most of them fall into the categories of either optimization based or heuristic methods. Distinct from the existing studies, the work presented in this paper explores the notion of network communities and their properties based on the dynamics of a stochastic model naturally introduced. In the paper, a relationship between the hierarchical community structure of a network and the local mixing properties of such a stochastic model has been established with the large-deviation theory. Topological information regarding to the community structures hidden in networks can be inferred from their spectral signatures. Based on the above-mentioned relationship, this work proposes a general framework for characterizing, analyzing, and mining network communities. Utilizing the two basic properties of metastability, i.e., being locally uniform and temporarily fixed, an efficient implementation of the framework, called the LM algorithm, has been developed that can scalably mine communities hidden in large-scale networks. The effectiveness and efficiency of the LM algorithm have been theoretically analyzed as well as experimentally validated.
AB - Network communities refer to groups of vertices within which their connecting links are dense but between which they are sparse. A network community mining problem (or NCMP for short) is concerned with the problem of finding all such communities from a given network. A wide variety of applications can be formulated as NCMPs, ranging from social and/or biological network analysis to web mining and searching. So far, many algorithms addressing NCMPs have been developed and most of them fall into the categories of either optimization based or heuristic methods. Distinct from the existing studies, the work presented in this paper explores the notion of network communities and their properties based on the dynamics of a stochastic model naturally introduced. In the paper, a relationship between the hierarchical community structure of a network and the local mixing properties of such a stochastic model has been established with the large-deviation theory. Topological information regarding to the community structures hidden in networks can be inferred from their spectral signatures. Based on the above-mentioned relationship, this work proposes a general framework for characterizing, analyzing, and mining network communities. Utilizing the two basic properties of metastability, i.e., being locally uniform and temporarily fixed, an efficient implementation of the framework, called the LM algorithm, has been developed that can scalably mine communities hidden in large-scale networks. The effectiveness and efficiency of the LM algorithm have been theoretically analyzed as well as experimentally validated.
KW - community structure
KW - large-deviation theory
KW - local mixing
KW - Markov chain
KW - Social network
UR - http://www.scopus.com/inward/record.url?scp=84555204216&partnerID=8YFLogxK
U2 - 10.1109/TKDE.2010.233
DO - 10.1109/TKDE.2010.233
M3 - Journal article
AN - SCOPUS:84555204216
SN - 1041-4347
VL - 24
SP - 326
EP - 337
JO - IEEE Transactions on Knowledge and Data Engineering
JF - IEEE Transactions on Knowledge and Data Engineering
IS - 2
M1 - 5645622
ER -