Abstract
Clustering over a graph seeks to partition the nodes therein into disjoint groups such that nodes within the same cluster are tightly-knit, while those across clusters are distant from each other. In practice, graphs are often attended with rich attributes, which are termed attributed graphs. By leveraging the complementary nature of graph topology and node attributes in such graphs, graph neural networks (GNNs) have obtained encouraging performance in graph clustering. However, existing GNN-based approaches strongly rely on the homophilic assumption of the input graph, and thus, largely fail on heterophilic graphs and others embodying numerous missing or noisy links, which are widely present in real life.
To bridge this gap, this paper presents DGAC, an effective graph-agnostic solution for graph clustering. Particularly, DGAC overcomes the limitations of prior works by exploiting the high-order connectivity of nodes within not only the input graph G but also the affinity graph H underlying the attribute data. To achieve this goal, we first unify the embedding and clustering generations into a coherent framework optimizing the Dirichlet Energy on both G and H. Based thereon, theoretically-grounded solvers are developed for efficient constructions of the embeddings and clusters via graph diffusion operations, which aggregate features from specific neighbors, enabling the capture of high-order semantics from G or H. On top of that, DGAC includes three training loss functions that facilitate effective feature extraction and clustering. Extensive experiments, comparing DGAC against 12 baselines over 12 homophilic or heterophilic graph datasets, showcase that DGAC consistently and considerably outperforms all competitors in terms of clustering quality measured against ground truth labels.
To bridge this gap, this paper presents DGAC, an effective graph-agnostic solution for graph clustering. Particularly, DGAC overcomes the limitations of prior works by exploiting the high-order connectivity of nodes within not only the input graph G but also the affinity graph H underlying the attribute data. To achieve this goal, we first unify the embedding and clustering generations into a coherent framework optimizing the Dirichlet Energy on both G and H. Based thereon, theoretically-grounded solvers are developed for efficient constructions of the embeddings and clusters via graph diffusion operations, which aggregate features from specific neighbors, enabling the capture of high-order semantics from G or H. On top of that, DGAC includes three training loss functions that facilitate effective feature extraction and clustering. Extensive experiments, comparing DGAC against 12 baselines over 12 homophilic or heterophilic graph datasets, showcase that DGAC consistently and considerably outperforms all competitors in terms of clustering quality measured against ground truth labels.
Original language | English |
---|---|
Title of host publication | Proceedings of the ACM on Web Conference 2025 |
Place of Publication | New York |
Publisher | Association for Computing Machinery (ACM) |
Pages | 1353-1364 |
Number of pages | 12 |
ISBN (Print) | 9798400712746 |
DOIs | |
Publication status | Published - 22 Apr 2025 |
Event | The ACM Web Conference, WWW 2025 - International Convention & Exhibition Centre, Sydney, Australia Duration: 28 Apr 2025 → 2 May 2025 https://www2025.thewebconf.org/ (Conference website) https://dl.acm.org/doi/proceedings/10.1145/3696410 (Conference proceedings) |
Publication series
Name | Proceedings of the ACM on Web Conference 2025 |
---|---|
Publisher | Association for Computing Machinery |
Conference
Conference | The ACM Web Conference, WWW 2025 |
---|---|
Abbreviated title | WWW '25 |
Country/Territory | Australia |
City | Sydney |
Period | 28/04/25 → 2/05/25 |
Internet address |
|
User-Defined Keywords
- Graph clustering
- Dirichlet Energy
- affinity graph
- graph neural network
- heterophily graph clustering
- contrastive learning