## Project Details

### Description

The graph model has been widely used to model complex relationships among entities. Examples include DBLP (citation relationships), Facebook (friend relationships) and the whole internet (link relationships). Many algorithms and systems are designed to work with these graphs. However, these graphs are growing more rapidly than the capacity of machines. For example, according to the statistics, in 2014, Facebook had more than 1 billion active users and more than 130 billion friendship edges. Just keeping all the edges of such a graph in memory would require hundreds of gigabytes. Moreover, the rate of increase of active users from 2014 to 2015 was 12%. Such a rate of growth makes it harder to work on these graphs as the main memory cannot hold the whole graphs and without careful design, a graph algorithm usually suffers from high latency when the graph has to be kept on disk. On the other hand, distributed systems are usually inefficient when dealing with many graph operations due to large communication costs. In this project, we focus on I/O efficient graph algorithms in which the size of the graph exceeds the size of the main memory and therefore the graph has to be kept on disk. The key issues we will investigate include graph storage, I/O efficient graph algorithms and system design. We aim to deal with operations on massive graphs by accessing only a small part of the whole. Previous studies usually consider subgraphs of the original graph iteratively with heuristic optimizations. Different from that, our methodologies focus on dividing and contracting these graphs to obtain a set of contracted graphs that can be used as the input to graph algorithms to reduce the cost. The contracted graphs are not certain to be the subgraphs of the original, but capture more structural properties. Under such a framework, we will design graph algorithms for different fundamental graph operations. In addition, we will integrate multiple graph operations to build a unified graph processing system.

The importance of this project is in its exploration of a new framework for I/O efficient graph processing. We will concentrate on how to contract graphs, how to make the graph algorithms work with these graphs and how to design an effective indexing scheme to reduce the graph size. We will conduct extensive performance studies using real-world data in our prototype system to demonstrate the strength of our approach.

The importance of this project is in its exploration of a new framework for I/O efficient graph processing. We will concentrate on how to contract graphs, how to make the graph algorithms work with these graphs and how to design an effective indexing scheme to reduce the graph size. We will conduct extensive performance studies using real-world data in our prototype system to demonstrate the strength of our approach.

Status | Finished |
---|---|

Effective start/end date | 1/01/17 → 31/12/19 |

## Fingerprint

Explore the research topics touched on by this project. These labels are generated based on the underlying awards/grants. Together they form a unique fingerprint.