Abstract
Clustering is one of the most commonly used techniques for unsupervised data analysis. Real data sets usually mix numerical and categorical features, which are heterogeneous in nature, and the resulting heterogeneity in distance metrics and feature couplings prevents deep representation learning from achieving satisfactory clustering accuracy. Recently, supervised Quaternion Representation Learning (QRL) has achieved remarkable success in efficiently learning informative representations of coupled features from multiple views derived endogenously from the original data. To carry the advantages of QRL over to unsupervised representation learning on heterogeneous features, we propose a deep QRL model that works in an encoder-decoder manner. To ensure that the implicit couplings among heterogeneous features are well characterized by the learned representation, we design a hierarchical coupling encoding strategy that converts the data set into an attributed graph, which serves as the input to QRL. We also integrate the clustering objective into model training so that the representation and the clustering are optimized jointly. Extensive experimental evaluations illustrate the superiority of the proposed Quaternion Graph Representation Learning (QGRL) method in terms of clustering accuracy and robustness across data sets composed of arbitrary combinations of numerical and categorical features. The source code is available at https://github.com/Juny-Chen/QGRL.git.
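As background for readers unfamiliar with quaternions: the abstract does not describe the internals of the model, but quaternion-valued layers are generally built on the Hamilton product, which mixes the four components of one quaternion with those of another. The following is a minimal sketch of that product only, not the authors' implementation; the function name `hamilton_product` and the tuple layout `(real, i, j, k)` are assumptions made for illustration.

```python
def hamilton_product(p, q):
    """Multiply two quaternions given as (real, i, j, k) tuples.

    Illustrative helper (hypothetical name), not taken from the QGRL code.
    """
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i component
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j component
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k component
    )


# Basis quaternions used to check the defining identities.
I = (0, 1, 0, 0)
J = (0, 0, 1, 0)

ij = hamilton_product(I, J)  # i * j = k
ji = hamilton_product(J, I)  # j * i = -k (the product is not commutative)
ii = hamilton_product(I, I)  # i * i = -1
```

Because every output component depends on all four input components, a layer built on this product couples its four input views by construction, which is the general property that quaternion representation learning exploits.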
Original language | English |
---|---|
Title of host publication | KDD 2024 - Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining |
Publisher | Association for Computing Machinery (ACM) |
Pages | 297-306 |
Number of pages | 10 |
ISBN (Electronic) | 9798400704901 |
Publication status | Published - 24 Aug 2024 |
Event | 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2024 - Barcelona, Spain. Duration: 25 Aug 2024 → 29 Aug 2024. https://kdd2024.kdd.org/ |
Publication series
Name | Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining |
---|---|
ISSN (Print) | 2154-817X |
Conference
Conference | 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2024 |
---|---|
Country/Territory | Spain |
City | Barcelona |
Period | 25/08/24 → 29/08/24 |
Scopus Subject Areas
- Software
- Information Systems
User-Defined Keywords
- quaternion representation learning
- graph neural network
- heterogeneous features
- spectral clustering