Abstract
Understanding categorical data with vague qualitative values by forming clusters is crucial in many data-driven AI fields. Unlike numerical data, whose quantitative values are embedded in a well-defined Euclidean distance space, the distances between qualitative values are inherently unknown and must be specially defined for particular data types or tasks. This paper therefore proposes a distance metric space fusion framework, which learns to fuse multiple distance metrics into a single metric that is complete in statistical information and comprehensive in prior knowledge, for robust and accurate cluster analysis of qualitative data. To better serve various clustering tasks, the metric fusion objective is incorporated into the clustering objective through iterative learning. The proposed method consistently demonstrates superior performance on various challenging real-world benchmark datasets. Extensive experiments, including significance tests and ablation studies, validate its efficacy. Source code of the proposed method is available at https://github.com/Sen-Feng/ICASSP-MSF/tree/main/CODE.
Original language | English |
---|---|
Title of host publication | Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) |
Editors | Bhaskar D Rao, Isabel Trancoso, Gaurav Sharma, Neelesh B. Mehta |
Publisher | IEEE |
Number of pages | 5 |
ISBN (Electronic) | 9798350368741 |
ISBN (Print) | 9798350368758 |
Publication status | Published - 6 Apr 2025 |
Event | 2025 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2025 - Hyderabad, India. Duration: 6 Apr 2025 → 11 Apr 2025. https://ieeexplore.ieee.org/xpl/conhome/10887540/proceeding |
Publication series
Name | Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) |
---|---|
Publisher | IEEE |
Conference
Conference | 2025 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2025 |
---|---|
Country/Territory | India |
City | Hyderabad |
Period | 6/04/25 → 11/04/25 |
User-Defined Keywords
- Cluster analysis
- categorical data
- unsupervised learning
- metric space learning
- robust and accurate clustering