TY - JOUR
T1 - ClusPar: A Game-Theoretic Approach for Efficient and Scalable Streaming Edge Partitioning
AU - Ding, Zezhong
AU - Kong, Deyu
AU - Zhang, Zhuoxu
AU - Xie, Xike
AU - Xu, Jianliang
N1 - This work is supported by NSFC (No.61772492, 62072428).
PY - 2025/1
Y1 - 2025/1
N2 - Streaming edge partitioning plays a crucial role in the distributed processing of large-scale web graphs, such as pagerank. The quality of partitioning is of utmost importance and directly affects the runtime cost of distributed graph processing. However, streaming graph clustering, a key component of mainstream streaming edge partitioning, is vertex-centric. This incurs a mismatch with the edge-centric partitioning strategy, necessitating additional post-processing and several graph traversals to transition from vertex-centric clusters to edge-centric partitions. This transition not only adds extra runtime overhead but also risks a decline in partitioning quality. In this paper, we propose a novel algorithm, called ClusPar, to address the problem of streaming edge partitioning. The ClusPar framework consists of two steps, streaming edge clustering and edge cluster partitioning. Different from prior studies, the first step traverses the input graph in a single pass to generate edge-centric clusters, while the second step applies game theory over these edge-centric clusters to produce partitions. Extensive experiments show that ClusPar outperforms the state-of-the-art streaming edge partitioning methods in terms of the partitioning quality, efficiency, and scalability.
AB - Streaming edge partitioning plays a crucial role in the distributed processing of large-scale web graphs, such as pagerank. The quality of partitioning is of utmost importance and directly affects the runtime cost of distributed graph processing. However, streaming graph clustering, a key component of mainstream streaming edge partitioning, is vertex-centric. This incurs a mismatch with the edge-centric partitioning strategy, necessitating additional post-processing and several graph traversals to transition from vertex-centric clusters to edge-centric partitions. This transition not only adds extra runtime overhead but also risks a decline in partitioning quality. In this paper, we propose a novel algorithm, called ClusPar, to address the problem of streaming edge partitioning. The ClusPar framework consists of two steps, streaming edge clustering and edge cluster partitioning. Different from prior studies, the first step traverses the input graph in a single pass to generate edge-centric clusters, while the second step applies game theory over these edge-centric clusters to produce partitions. Extensive experiments show that ClusPar outperforms the state-of-the-art streaming edge partitioning methods in terms of the partitioning quality, efficiency, and scalability.
KW - Distributed Graph System
KW - Parallelization
KW - Streaming Edge Partitioning
UR - https://ieeexplore.ieee.org/document/10707328
UR - http://www.scopus.com/inward/record.url?scp=85206947149&partnerID=8YFLogxK
U2 - 10.1109/TC.2024.3475568
DO - 10.1109/TC.2024.3475568
M3 - Journal article
AN - SCOPUS:85206947149
SN - 0018-9340
VL - 74
SP - 116
EP - 130
JO - IEEE Transactions on Computers
JF - IEEE Transactions on Computers
IS - 1
ER -