Abstract
Hierarchical clustering has been extensively applied for data analysis and knowledge discovery. However, the scalability of hierarchical clustering methods is generally limited due to their time complexity of O(n²), where n is the size of the input data. To address this issue, we present a fast and accurate hierarchical clustering algorithm based on topology training. Specifically, a trained multilayer topological structure that fits the spatial distribution of the data is utilized to accelerate the similarity measurement, which dominates the computational cost in hierarchical clustering. Moreover, the topological structure also guides the merging steps in hierarchical clustering to form a meaningful and accurate clustering result. In addition, an incremental version of the proposed algorithm is further designed so that the proposed approach is applicable to the streaming data as well. Promising experimental results on various data sets demonstrate the efficiency and effectiveness of the proposed algorithms.
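To make the quadratic cost mentioned in the abstract concrete, the following minimal sketch computes the full set of pairwise distances that a conventional hierarchical clustering method needs before its first merge. It is not the topology-based algorithm of the paper; the function names `pairwise_distances` and `count_pairs`, the Euclidean metric, and the sample points are all illustrative assumptions.

```python
import math
from itertools import combinations


def pairwise_distances(points):
    """Return {(i, j): Euclidean distance} for every pair i < j.

    Illustrative helper (not from the paper): this is the similarity
    measurement step whose cost grows quadratically with len(points).
    """
    return {
        (i, j): math.dist(points[i], points[j])
        for i, j in combinations(range(len(points)), 2)
    }


def count_pairs(n):
    """Number of distinct pairs among n points: n * (n - 1) / 2."""
    return n * (n - 1) // 2


# Hypothetical sample data, chosen only for illustration.
points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0), (0.0, 1.0)]
dists = pairwise_distances(points)

# Four points already require six distance evaluations.
print(len(dists), count_pairs(len(points)))
```

Because the number of pairs is n(n - 1)/2, doubling the input size roughly quadruples the number of distance evaluations, which is the scalability limit the abstract sets out to address.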
Original language | English |
---|---|
Article number | 8423698 |
Pages (from-to) | 876-890 |
Number of pages | 15 |
Journal | IEEE Transactions on Neural Networks and Learning Systems |
Volume | 30 |
Issue number | 3 |
Publication status | Published - Mar 2019 |
Scopus Subject Areas
- Software
- Computer Science Applications
- Computer Networks and Communications
- Artificial Intelligence
User-Defined Keywords
- Data analysis
- hierarchical clustering
- incremental algorithm
- time complexity
- topology