TY - JOUR
T1 - Detecting Communities with Multiplex Semantics by Distinguishing Background, General, and Specialized Topics
AU - Jin, Di
AU - Wang, Kunzeng
AU - Zhang, Ge
AU - Jiao, Pengfei
AU - He, Dongxiao
AU - Fogelman-Soulie, Francoise
AU - Huang, Xin
N1 - Funding Information: Supported by the National Natural Science Foundation of China (No. 61772361, 61876128, 61702435 and 61902278) and the Research Grants Council of the Hong Kong Special Administrative Region, China (No. HKBU 12200917).
PY - 2020/11/1
Y1 - 2020/11/1
N2 - Finding semantic communities using network topology and contents together is a hot topic in community detection. Existing methods often use word attributes indiscriminately to help find communities. Our analysis shows that words in networked contents often embody a hierarchical semantic structure. Some words reflect a background topic shared by the whole network and all of its communities, some imply a high-level general topic covering several topic-related communities, and some imply a high-resolution specialized topic that describes a single community. Ignoring this semantic structure often leads to a deficient description of networked contents, in which deep semantics are not fully utilized. To solve this problem, we propose a new Bayesian probabilistic model. By distinguishing whether words come from a background topic or from two-level topics (i.e., general and specialized topics), this model not only makes better use of the networked contents to help find communities, but also provides a clearer multiplex semantic interpretation of communities. We then give an efficient variational algorithm for model inference. The superiority of this new approach is demonstrated by comparison with ten state-of-the-art methods on nine real networks and an artificial benchmark. A case study is further provided to show its strong ability to interpret the deep semantics of communities.
AB - Finding semantic communities using network topology and contents together is a hot topic in community detection. Existing methods often use word attributes indiscriminately to help find communities. Our analysis shows that words in networked contents often embody a hierarchical semantic structure. Some words reflect a background topic shared by the whole network and all of its communities, some imply a high-level general topic covering several topic-related communities, and some imply a high-resolution specialized topic that describes a single community. Ignoring this semantic structure often leads to a deficient description of networked contents, in which deep semantics are not fully utilized. To solve this problem, we propose a new Bayesian probabilistic model. By distinguishing whether words come from a background topic or from two-level topics (i.e., general and specialized topics), this model not only makes better use of the networked contents to help find communities, but also provides a clearer multiplex semantic interpretation of communities. We then give an efficient variational algorithm for model inference. The superiority of this new approach is demonstrated by comparison with ten state-of-the-art methods on nine real networks and an artificial benchmark. A case study is further provided to show its strong ability to interpret the deep semantics of communities.
KW - Bayesian probabilistic model
KW - Community detection
KW - multiplex semantics
KW - variational inference
UR - http://www.scopus.com/inward/record.url?scp=85072522691&partnerID=8YFLogxK
U2 - 10.1109/TKDE.2019.2937298
DO - 10.1109/TKDE.2019.2937298
M3 - Journal article
AN - SCOPUS:85072522691
SN - 1041-4347
VL - 32
SP - 2144
EP - 2158
JO - IEEE Transactions on Knowledge and Data Engineering
JF - IEEE Transactions on Knowledge and Data Engineering
IS - 11
M1 - 8832212
ER -