Realtime Relevance Search over Massive Attributed Graphs

Project: Research project

Project Details


Relevance search over graph-structured data aims to retrieve from the graph the nodes that are most similar to the query node. This task serves as building blocks in various systems including graph databases, search engines, and recommendation systems to support ranking and recommendation services. Extensive efforts have been made in the past two decades towards designing effective relevance measures and efficient search/query algorithms. In most of them, node attributes (e.g., gender of a user and the DNA sequence of a gene) are largely ignored, and hence, leading to compromised result utility. It is highly challenging to seamlessly incorporate attribute information into the topological proximities as nodes and attributes are two heterogeneous objects with inherently different traits and their respective importance varies from node to node. Moreover, in the era of big data, graph data grows explosively in size and dynamicity, while many online services require real-time responses (within less than 100ms), pushing the difficulty of relevance search over attributed graphs to a whole new level. To our knowledge, none of the existing studies address these two challenges.
Motivated by this, this project aims to overcome the above limitations through three major contributions. First, we plan to design a new relevance measure specially catered for attributed graphs, which learns personalized weights for each node to optimize the combination of attribute and topology information for enhanced result utility. We will also propose a number of non-trivial optimization techniques to enable efficient parameter learning on massive attributed graphs. In addition, to realize the real-time relevance search queries on large static graphs, we intend to devise novel index structures based on the candidate sets generated by a carefully-crafted network embedding approach. Equipped with the indexes, an online bidirectional search algorithm will be developed to support real-time query processing. Finally, we will investigate dynamic network embedding techniques and incremental index-update schemes, thereby facilitating the extension of the real-time query processing to large dynamic graphs. Using our rich research experience in relevance search over graphs, attributed graph analysis, and network embedding as a backing, we expect this project to create a suite of techniques and tools for improved ranking and recommendation services in the industry.
Effective start/end date1/01/2431/12/26


Explore the research topics touched on by this project. These labels are generated based on the underlying awards/grants. Together they form a unique fingerprint.