Abstract
Graph partitioning is an important problem to the performance and cost optimization of graph analytics in geo-distributed environments. Modern hybrid-cut model is expected to obtain better performance and cost optimizations than traditional partitioning models, but can further complicate geo-distributed graph partitioning which is already a challenging problem due to large graph sizes and network heterogeneities of geo-distributed DCs. Existing studies usually adopt heuristic-based methods to achieve fast partitioning for large graphs, which unfortunately sacrifices optimization effectiveness. Further, graph structures of many applications can change at various frequencies. Dynamic partitioning methods usually focus on achieving low latency to quickly adapt to changes, which may again sacrifice partitioning effectiveness. Also, such methods are not aware of the dynamicity of graphs and can over sacrifice effectiveness for unnecessarily low latency.
In this paper, we propose RLCut, which uses Reinforcement Learning (RL) to help taming the complexity of the problem. Specifically, RLCut uses multi-agent learning which is more efficient than single agent RL and incorporates a sampling based optimization to adaptively control the training process to satisfy required trade-off between partitioning effectiveness and efficiency according to graph dynamicity. Experiments using real cloud DCs and real-world graphs show that, compared to state-of-the-art static partitioning methods, RLCut improves the performance of geo-distributed graph analytics by 10%-100% with comparable overhead. When users tolerate longer partitioning overhead, we can further improve the performance by up to 43%. With varying graph changing frequencies, RLCut can improve the performance by up to 60% compared to state-of-the-art dynamic partitioning.
Original language | English |
---|---|
Title of host publication | Proceedings - 2022 IEEE 38th International Conference on Data Engineering, ICDE 2022 |
Publisher | IEEE Computer Society |
Pages | 2818-2830 |
Number of pages | 13 |
ISBN (Electronic) | 9781665408837 |
ISBN (Print) | 9781665408844 |
DOIs | |
Publication status | Published - May 2022 |
Event | 38th IEEE International Conference on Data Engineering, ICDE 2022 - Virtual, Malaysia Duration: 9 May 2022 → 12 May 2022 https://icde2022.ieeecomputer.my/ (Conference Website) |
Publication series
Name | Proceedings - International Conference on Data Engineering |
---|---|
ISSN (Print) | 1084-4627 |
ISSN (Electronic) | 2375-026X |
Conference
Conference | 38th IEEE International Conference on Data Engineering, ICDE 2022 |
---|---|
Country/Territory | Malaysia |
City | Virtual |
Period | 9/05/22 → 12/05/22 |
Internet address |
|
Scopus Subject Areas
- Software
- Signal Processing
- Information Systems
User-Defined Keywords
- geo-distributed DCs
- graph partitioning
- reinforcement learning