Top-k Graph Summarization on Hierarchical DAGs

Xuliang Zhu, Xin Huang*, Koon Kau Choi, Jianliang Xu

*Corresponding author for this work

Research output: Chapter in book/report/conference proceeding, Conference proceeding, peer-review

5 Citations (Scopus)


A directed acyclic graph (DAG) is an important model for representing terminologies and their hierarchical relationships, such as the Disease Ontology. Because a large DAG contains massive numbers of terminologies and complex structures, summarizing the whole hierarchical DAG is challenging. In this paper, we study a new problem of finding k representative vertices to summarize a hierarchical DAG. To capture both diverse summarization and important vertices, we design a summary score function that measures vertices' diversity coverage and structure correlation. We prove that the studied problem is NP-hard. To tackle it efficiently, we propose a greedy algorithm with an approximation guarantee, which iteratively adds vertices with large summary contributions to the answer. To further improve answer quality, we propose a subtree-extraction-based method, which is proven to achieve higher-quality answers. In addition, we develop a scalable algorithm, k-PCGS, based on candidate pruning and DAG compression for large-scale hierarchical DAGs. Extensive experiments on large real-world datasets demonstrate both the effectiveness and efficiency of the proposed algorithms.
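
The greedy idea mentioned in the abstract can be sketched roughly as follows. The abstract does not define the summary score function, so this sketch substitutes a hypothetical stand-in: a selected vertex s represents each of its descendants u with weight 1/(1 + dist(s, u)), and the score of a summary is the sum, over all vertices, of the best weight any selected vertex gives them. The names `reach_distances`, `greedy_summary`, and `toy_dag` are illustrative assumptions, not taken from the paper, and the sketch does not include the subtree extraction, candidate pruning, or DAG compression steps.

```python
from collections import deque


def reach_distances(dag, source):
    """BFS from source: hop count to every reachable vertex (source is 0)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for child in dag.get(v, []):
            if child not in dist:
                dist[child] = dist[v] + 1
                queue.append(child)
    return dist


def greedy_summary(dag, k):
    """Pick up to k vertices greedily under the ASSUMED stand-in score.

    dag maps each vertex to a list of its children. The weight
    1 / (1 + distance) is a hypothetical choice, not the paper's score.
    """
    vertices = set(dag)
    for children in dag.values():
        vertices.update(children)

    # weight[s][u]: how well s would represent its descendant u (assumed).
    weight = {
        s: {u: 1.0 / (1 + d) for u, d in reach_distances(dag, s).items()}
        for s in vertices
    }

    best_so_far = {u: 0.0 for u in vertices}  # best weight per vertex so far
    chosen = []
    for _ in range(min(k, len(vertices))):
        best, best_gain = None, -1.0
        # Sorted iteration keeps tie-breaking deterministic.
        for s in sorted(vertices.difference(chosen), key=repr):
            gain = sum(max(0.0, w - best_so_far[u]) for u, w in weight[s].items())
            if gain > best_gain:
                best, best_gain = s, gain
        chosen.append(best)
        for u, w in weight[best].items():
            best_so_far[u] = max(best_so_far[u], w)
    return chosen


# Toy hierarchy loosely shaped like an ontology (made-up data).
toy_dag = {
    "disease": ["infectious", "genetic"],
    "infectious": ["viral", "bacterial", "fungal"],
    "genetic": ["monogenic"],
    "viral": ["influenza"],
}

print(greedy_summary(toy_dag, 2))  # ['disease', 'infectious']
```

On the toy hierarchy, the root is picked first because it reaches every vertex, and the second pick goes to the internal vertex whose subtree is still poorly represented. Each round recomputes marginal gains against the current selection, which is the step a scalable method would need to prune or accelerate.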

Original language: English
Title of host publication: CIKM 2020 - Proceedings of the 29th ACM International Conference on Information and Knowledge Management
Publisher: Association for Computing Machinery (ACM)
Number of pages: 10
ISBN (Electronic): 9781450368599
Publication status: Published - 19 Oct 2020
Event: 29th ACM International Conference on Information and Knowledge Management, CIKM 2020 - Virtual, Online, Ireland
Duration: 19 Oct 2020 to 23 Oct 2020

Publication series

Name: International Conference on Information and Knowledge Management, Proceedings


Conference: 29th ACM International Conference on Information and Knowledge Management, CIKM 2020
City: Virtual, Online

Scopus Subject Areas

  • Business, Management and Accounting (all)
  • Decision Sciences (all)

User-Defined Keywords

  • data summarization
  • directed acyclic graph
  • hierarchy

