TY - JOUR
T1 - ClusPar: A Game-Theoretic Approach for Efficient and Scalable Streaming Edge Partitioning
AU - Ding, Zezhong
AU - Kong, Deyu
AU - Zhang, Zhuoxu
AU - Xie, Xike
AU - Xu, Jianliang
N1 - This work was supported in part by the NSFC under Grant 62472400 and Grant 62072428 and in part by Hong Kong RGC under Grant
R1015-23 and Grant 12202024.
Publisher Copyright:
0018-9340 © 2024 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.
PY - 2025/1
Y1 - 2025/1
N2 - Streaming edge partitioning plays a crucial role in the distributed processing of large-scale web graphs, such as pagerank. The quality of partitioning is of utmost importance and directly affects the runtime cost of distributed graph processing. However, streaming graph clustering, a key component of mainstream streaming edge partitioning, is vertex-centric. This incurs a mismatch with the edge-centric partitioning strategy, necessitating additional post-processing and several graph traversals to transition from vertex-centric clusters to edge-centric partitions. This transition not only adds extra runtime overhead but also risks a decline in partitioning quality. In this paper, we propose a novel algorithm, called ClusPar, to address the problem of streaming edge partitioning. The ClusPar framework consists of two steps, streaming edge clustering and edge cluster partitioning. Different from prior studies, the first step traverses the input graph in a single pass to generate edge-centric clusters, while the second step applies game theory over these edge-centric clusters to produce partitions. Extensive experiments show that ClusPar outperforms the state-of-the-art streaming edge partitioning methods in terms of the partitioning quality, efficiency, and scalability.
AB - Streaming edge partitioning plays a crucial role in the distributed processing of large-scale web graphs, such as pagerank. The quality of partitioning is of utmost importance and directly affects the runtime cost of distributed graph processing. However, streaming graph clustering, a key component of mainstream streaming edge partitioning, is vertex-centric. This incurs a mismatch with the edge-centric partitioning strategy, necessitating additional post-processing and several graph traversals to transition from vertex-centric clusters to edge-centric partitions. This transition not only adds extra runtime overhead but also risks a decline in partitioning quality. In this paper, we propose a novel algorithm, called ClusPar, to address the problem of streaming edge partitioning. The ClusPar framework consists of two steps, streaming edge clustering and edge cluster partitioning. Different from prior studies, the first step traverses the input graph in a single pass to generate edge-centric clusters, while the second step applies game theory over these edge-centric clusters to produce partitions. Extensive experiments show that ClusPar outperforms the state-of-the-art streaming edge partitioning methods in terms of the partitioning quality, efficiency, and scalability.
KW - Distributed Graph System
KW - Parallelization
KW - Streaming Edge Partitioning
UR - http://www.scopus.com/inward/record.url?scp=86000386701&partnerID=8YFLogxK
U2 - 10.1109/TC.2024.3475568
DO - 10.1109/TC.2024.3475568
M3 - Journal article
AN - SCOPUS:86000386701
SN - 0018-9340
VL - 74
SP - 116
EP - 130
JO - IEEE Transactions on Computers
JF - IEEE Transactions on Computers
IS - 1
ER -