Top-k Graph Summarization on Hierarchical DAGs

Xuliang Zhu, Xin Huang*, Koon Kau Choi, Jianliang Xu

*Corresponding author for this work

Research output: Chapter in book/report/conference proceedingConference proceedingpeer-review

5 Citations (Scopus)

Abstract

Directed acyclic graph (DAG) is an essentially important model to represent terminologies and their hierarchical relationships, such as Disease Ontology. Due to massive terminologies and complex structures in a large DAG, it is challenging to summarize the whole hierarchical DAG. In this paper, we study a new problem of finding k representative vertices to summarize a hierarchical DAG. To depict diverse summarization and important vertices, we design a summary score function for capturing vertices' diversity coverage and structure correlation. The studied problem is theoretically proven to be NP-hard. To efficiently tackle it, we propose a greedy algorithm with an approximation guarantee, which iteratively adds vertices with the large summary contributions into answers. To further improve answer quality, we propose a subtree extraction based method, which is proven to guarantee achieving higher-quality answers. In addition, we develop a scalable algorithm k-PCGS based on candidate pruning and DAG compression for large-scale hierarchical DAGs. Extensive experiments on large real-world datasets demonstrate both the effectiveness and efficiency of proposed algorithms.

Original languageEnglish
Title of host publicationCIKM 2020 - Proceedings of the 29th ACM International Conference on Information and Knowledge Management
PublisherAssociation for Computing Machinery (ACM)
Pages1903-1912
Number of pages10
ISBN (Electronic)9781450368599
DOIs
Publication statusPublished - 19 Oct 2020
Event29th ACM International Conference on Information and Knowledge Management, CIKM 2020 - Virtual, Online, Ireland
Duration: 19 Oct 202023 Oct 2020

Publication series

NameInternational Conference on Information and Knowledge Management, Proceedings

Conference

Conference29th ACM International Conference on Information and Knowledge Management, CIKM 2020
Country/TerritoryIreland
CityVirtual, Online
Period19/10/2023/10/20

Scopus Subject Areas

  • Business, Management and Accounting(all)
  • Decision Sciences(all)

User-Defined Keywords

  • data summarization
  • directed acyclic graph
  • hierarchy

Fingerprint

Dive into the research topics of 'Top-k Graph Summarization on Hierarchical DAGs'. Together they form a unique fingerprint.

Cite this