TY - GEN
T1 - Improving large graph processing on partitioned graphs in the cloud
AU - Chen, Rishan
AU - Weng, Xuetian
AU - He, Bingsheng
AU - Yang, Mao
AU - Choi, Koon Kau
AU - Li, Xiaoming
N1 - Copyright 2012 Elsevier B.V., All rights reserved.
PY - 2012
Y1 - 2012
AB - As the study of large graphs over hundreds of gigabytes becomes increasingly popular for various data-intensive applications in cloud computing, developing large graph processing systems has become a hot and fruitful research area. Many of those existing systems support a vertex-oriented execution model and allow users to develop custom logic on vertices. However, the inherently random access pattern of vertex-oriented computation generates a significant amount of network traffic. While graph partitioning is known to be effective in reducing network traffic in graph processing, little attention has been given to how graph partitioning can be effectively integrated into large graph processing in the cloud environment. In this paper, we develop a novel graph partitioning framework to improve the network performance of graph partitioning itself, partitioned graph storage, and vertex-oriented graph processing. All optimizations are specifically designed for the cloud network environment. In experiments, we develop a system prototype following Pregel (the latest vertex-oriented graph engine by Google) and extend it with our graph partitioning framework. We conduct the experiments with a real-world social network and synthetic graphs over 100 GB each in a local cluster and on Amazon EC2. Our experimental results demonstrate the efficiency of our graph partitioning framework and the effectiveness of network-performance-aware optimizations on the large graph processing engine.
KW - Cloud computing
KW - Data center network
KW - Graph partitioning
KW - Large graph processing
UR - http://www.scopus.com/inward/record.url?scp=84870544283&partnerID=8YFLogxK
U2 - 10.1145/2391229.2391232
DO - 10.1145/2391229.2391232
M3 - Conference proceeding
AN - SCOPUS:84870544283
SN - 9781450317610
T3 - Proceedings of the 3rd ACM Symposium on Cloud Computing, SoCC 2012
BT - Proceedings of the 3rd ACM Symposium on Cloud Computing, SoCC 2012
T2 - 3rd ACM Symposium on Cloud Computing, SoCC 2012
Y2 - 14 October 2012 through 17 October 2012
ER -