TY - JOUR
T1 - Distributed D-core Decomposition over Large Directed Graphs
AU - Liao, Xuankun
AU - Liu, Qing
AU - Jiang, Jiaxin
AU - Huang, Xin
AU - Xu, Jianliang
AU - Choi, Byron
N1 - Funding information:
This work is supported by Hong Kong RGC Projects C2004-21GF, C6030-18GF, 12202221, 22200320, 12200021, 12201119, 12201518, and RIF Project R2002-20F.
Publisher copyright:
Copyright is held by the owner/author(s). Publication rights licensed to the VLDB Endowment.
PY - 2022/4
Y1 - 2022/4
N2 - Given a directed graph G and integers k and l, a D-core is the maximal subgraph H ⊆ G such that for every vertex of H, its in-degree and out-degree are no smaller than k and l, respectively. For a directed graph G, the problem of D-core decomposition aims to compute the non-empty D-cores for all possible values of k and l. In the literature, several peeling-based algorithms have been proposed to handle D-core decomposition. However, the peeling-based algorithms that work in a sequential fashion and require global graph information during processing are mainly designed for centralized settings, which cannot handle large-scale graphs efficiently in distributed settings. Motivated by this, we study the distributed D-core decomposition problem in this paper. We start by defining a concept called anchored coreness, based on which we propose a new H-index-based algorithm for distributed D-core decomposition. Furthermore, we devise a novel concept, namely skyline coreness, and show that the D-core decomposition problem is equivalent to the computation of skyline corenesses for all vertices. We design an efficient D-index to compute the skyline corenesses distributedly. We implement the proposed algorithms under both vertex-centric and block-centric distributed graph processing frameworks. Moreover, we theoretically analyze the algorithm and message complexities. Extensive experiments on large real-world graphs with billions of edges demonstrate the efficiency of the proposed algorithms in terms of both the running time and communication overhead.
AB - Given a directed graph G and integers k and l, a D-core is the maximal subgraph H ⊆ G such that for every vertex of H, its in-degree and out-degree are no smaller than k and l, respectively. For a directed graph G, the problem of D-core decomposition aims to compute the non-empty D-cores for all possible values of k and l. In the literature, several peeling-based algorithms have been proposed to handle D-core decomposition. However, the peeling-based algorithms that work in a sequential fashion and require global graph information during processing are mainly designed for centralized settings, which cannot handle large-scale graphs efficiently in distributed settings. Motivated by this, we study the distributed D-core decomposition problem in this paper. We start by defining a concept called anchored coreness, based on which we propose a new H-index-based algorithm for distributed D-core decomposition. Furthermore, we devise a novel concept, namely skyline coreness, and show that the D-core decomposition problem is equivalent to the computation of skyline corenesses for all vertices. We design an efficient D-index to compute the skyline corenesses distributedly. We implement the proposed algorithms under both vertex-centric and block-centric distributed graph processing frameworks. Moreover, we theoretically analyze the algorithm and message complexities. Extensive experiments on large real-world graphs with billions of edges demonstrate the efficiency of the proposed algorithms in terms of both the running time and communication overhead.
UR - http://vldb.org/pvldb/volumes/15/paper/Distributed%20D-core%20Decomposition%20over%20Large%20Directed%20Graphs
UR - http://www.scopus.com/inward/record.url?scp=85140972045&partnerID=8YFLogxK
U2 - 10.14778/3529337.3529340
DO - 10.14778/3529337.3529340
M3 - Conference article
SN - 2150-8097
VL - 15
SP - 1546
EP - 1558
JO - Proceedings of the VLDB Endowment
JF - Proceedings of the VLDB Endowment
IS - 8
ER -