Efficient Core Propagation based Hierarchical Graph Clustering

Jinbin Huang, Zihan Jia, Xin Huang*

*Corresponding author for this work

Research output: Chapter in book/report/conference proceedingConference proceedingpeer-review

Abstract

Communities, formed by a subset of vertices that are densely connected to each other and loosely connected to outside community members, widely exist to represent functional modules in real-world complex systems. Most existing community detection and search methods aim at finding communities at one single level, neglecting the natural properties of overlapping and hierarchy in communities. Therefore, the discovery of hierarchical graph clustering (HGC) to find communities at different levels, which is particularly useful in many applications. However, existing HGC studies suffer from two significant limitations: 1) inefficiency over large-scale networks, and 2) generating too many levels of community hierarchy without distinguishing the hierarchy differences. To address the above limitations, we revisit the problem of hierarchical graph clustering and formulate the problem based on our proposed three important properties. To tackle it, we propose theoretical-guaranteed fast solutions, in terms of algorithm complexity and hierarchy levels. We first formulate our HGC problem to admit three key properties of hierarchical communities. Based on the natural hierarchical structure of k-core, we develop a simple and importantly useful technique of core propagation. The key idea of core propagation is to take each k-core as one seed of hierarchical communities and find disjoint communities within k-core using a linear-time algorithm of label propagation. We propose two core propagation approaches of top-down and bottom-up algorithms, in terms of different search directions of k-cores by increment and decrement on k, respectively. The top-down method can find a given level of hierarchical communities in O(tm) time, where t is an input of hierarchy levels and m is the graph size. To dismiss the hardness of users' input hierarchy parameter t, the bottom-up algorithm is equipped with a well-designed strategy of auto-adjusting hierarchical levels based on the graph structure itself. We also develop the coreness weight-based label propagation to ensure the accurate label voting of compressed communities at low levels. The bottom-up method runs fast in O(m log δ(G)), where log δ(G) is a small value of the maximum coreness in graph G. Extensive experiments conducted on real-world graphs with ground-truth HGCs validate the effectiveness and efficiency of our proposed core propagation methods against state-of-the-art methods. Two case studies on the world-wide flight network and the Hong Kong road network demonstrate the particular usage of our HGC methods.

Original languageEnglish
Title of host publicationProceedings - 2025 IEEE 41st International Conference on Data Engineering, ICDE 2025
PublisherIEEE
Pages3904-3916
Number of pages13
ISBN (Electronic)9798331536039
ISBN (Print)9798331536046
DOIs
Publication statusPublished - 20 May 2025
Event41st IEEE International Conference on Data Engineering, ICDE 2025 - The Hong Kong Polytechnic University, Hong Kong, China
Duration: 19 May 202523 May 2025
https://ieee-icde.org/2025/
https://ieee-icde.org/2025/research-papers/
https://www.computer.org/csdl/proceedings/icde/2025/26FZy3xczFS

Publication series

NameProceedings - International Conference on Data Engineering
ISSN (Print)1063-6382
ISSN (Electronic)2375-026X

Conference

Conference41st IEEE International Conference on Data Engineering, ICDE 2025
Abbreviated titleICDE 2025
Country/TerritoryChina
CityHong Kong
Period19/05/2523/05/25
Internet address

User-Defined Keywords

  • core propagation
  • hierarchical graph clustering

Fingerprint

Dive into the research topics of 'Efficient Core Propagation based Hierarchical Graph Clustering'. Together they form a unique fingerprint.

Cite this