TY - JOUR
T1 - Unifying Structural Proximity and Equivalence for Network Embedding
AU - Shi, Benyun
AU - Zhou, Chunpeng
AU - Qiu, Hongjun
AU - Xu, Xiaobin
AU - Liu, Jiming
N1 - Funding Information:
This work was supported in part by the Natural Science Foundation of Jiangsu Province, China, under Grant BK20161563, in part by the Zhejiang Provincial Natural Science Foundation of China under Grant LQ19F030011, in part by the Hong Kong Research Grants Council under Grant RGC/HKBU12201619 and Grant 12201318, and in part by the NSFC-Zhejiang Joint Fund for the Integration of Industrialization and Informatization under Grant U1709215.
PY - 2019/8/1
Y1 - 2019/8/1
N2 - The fundamental purpose of network embedding is to automatically encode each node in a network as a low-dimensional vector, while at the same time preserving certain characteristics of the network. Based on the nodes' embeddings, downstream network analytic tasks such as community mining, node classification, and link prediction, can then be easily implemented using traditional machine learning methods. In recent years, extensive network embedding methods have been proposed based on factorization, random walks, deep learning, and so on. However, most of them focus mainly on preserving the structural proximity of network nodes, where highly interconnected nodes in a network will be represented closely together in the embedded vector space. While in many real-world networks, existing studies have revealed that high-order organizations (e.g., network motifs and graphlets) may be related to specific network functions. In this case, nodes far apart but with a similar organization in a network (i.e., structural equivalence) may have similar network functions. Accordingly, in this paper, we present a hybrid embedding method that unifies both structural proximity and equivalence (SPaE) of a network. Specifically, we adopt the concept of graphlet degree vector (GDV) to measure structural equivalence between network nodes. Through carrying out experiments on both synthetic and real-world datasets, we evaluate the performance of the hybrid embedding method in tasks of node clustering, node classification, and visualization. The results demonstrate that the proposed SPaE method outperforms several state-of-the-art methods when the network analytic tasks are not merely related to structural proximity. Finally, we also conduct experiments to evaluate the flexibility, robustness, and parameter sensitivity of the hybrid embedding method.
AB - The fundamental purpose of network embedding is to automatically encode each node in a network as a low-dimensional vector, while at the same time preserving certain characteristics of the network. Based on the nodes' embeddings, downstream network analytic tasks such as community mining, node classification, and link prediction, can then be easily implemented using traditional machine learning methods. In recent years, extensive network embedding methods have been proposed based on factorization, random walks, deep learning, and so on. However, most of them focus mainly on preserving the structural proximity of network nodes, where highly interconnected nodes in a network will be represented closely together in the embedded vector space. While in many real-world networks, existing studies have revealed that high-order organizations (e.g., network motifs and graphlets) may be related to specific network functions. In this case, nodes far apart but with a similar organization in a network (i.e., structural equivalence) may have similar network functions. Accordingly, in this paper, we present a hybrid embedding method that unifies both structural proximity and equivalence (SPaE) of a network. Specifically, we adopt the concept of graphlet degree vector (GDV) to measure structural equivalence between network nodes. Through carrying out experiments on both synthetic and real-world datasets, we evaluate the performance of the hybrid embedding method in tasks of node clustering, node classification, and visualization. The results demonstrate that the proposed SPaE method outperforms several state-of-the-art methods when the network analytic tasks are not merely related to structural proximity. Finally, we also conduct experiments to evaluate the flexibility, robustness, and parameter sensitivity of the hybrid embedding method.
KW - Graphlet degree vector
KW - network embedding
KW - node classification
KW - structural equivalence
KW - structural proximity
UR - http://www.scopus.com/inward/record.url?scp=85071461378&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2019.2932396
DO - 10.1109/ACCESS.2019.2932396
M3 - Journal article
AN - SCOPUS:85071461378
SN - 2169-3536
VL - 7
SP - 106124
EP - 106138
JO - IEEE Access
JF - IEEE Access
M1 - 8784281
ER -