TY - JOUR
T1 - Fast and Accurate Hierarchical Clustering Based on Growing Multilayer Topology Training
AU - Cheung, Yiu Ming
AU - Zhang, Yiqun
N1 - Funding Information:
Manuscript received April 18, 2017; revised November 2, 2017 and April 15, 2018; accepted June 27, 2018. Date of publication July 31, 2018; date of current version February 19, 2019. This work was supported in part by the National Natural Science Foundation of China under Grant 61672444 and Grant 61272366, in part by the SZSTI under Grant JCYJ20160531194006833, and in part by the Faculty Research Grant of Hong Kong Baptist University under Project FRG2/16-17/051 and FRG2/17-18/082. (Corresponding author: Yiu-ming Cheung.) Y. Cheung is with the Department of Computer Science, Hong Kong Baptist University (HKBU), Hong Kong, and also with the HKBU Institute of Research and Continuing Education, Shenzhen 518057, China.
PY - 2019/3
Y1 - 2019/3
N2 - Hierarchical clustering has been extensively applied for data analysis and knowledge discovery. However, the scalability of hierarchical clustering methods is generally limited due to their time complexity of O(n^2), where n is the size of the input data. To address this issue, we present a fast and accurate hierarchical clustering algorithm based on topology training. Specifically, a trained multilayer topological structure that fits the spatial distribution of the data is utilized to accelerate the similarity measurement, which dominates the computational cost in hierarchical clustering. Moreover, the topological structure also guides the merging steps in hierarchical clustering to form a meaningful and accurate clustering result. In addition, an incremental version of the proposed algorithm is further designed so that the proposed approach is applicable to the streaming data as well. Promising experimental results on various data sets demonstrate the efficiency and effectiveness of the proposed algorithms.
AB - Hierarchical clustering has been extensively applied for data analysis and knowledge discovery. However, the scalability of hierarchical clustering methods is generally limited due to their time complexity of O(n^2), where n is the size of the input data. To address this issue, we present a fast and accurate hierarchical clustering algorithm based on topology training. Specifically, a trained multilayer topological structure that fits the spatial distribution of the data is utilized to accelerate the similarity measurement, which dominates the computational cost in hierarchical clustering. Moreover, the topological structure also guides the merging steps in hierarchical clustering to form a meaningful and accurate clustering result. In addition, an incremental version of the proposed algorithm is further designed so that the proposed approach is applicable to the streaming data as well. Promising experimental results on various data sets demonstrate the efficiency and effectiveness of the proposed algorithms.
KW - Data analysis
KW - hierarchical clustering
KW - incremental algorithm
KW - time complexity
KW - topology
UR - http://www.scopus.com/inward/record.url?scp=85050963647&partnerID=8YFLogxK
U2 - 10.1109/TNNLS.2018.2853407
DO - 10.1109/TNNLS.2018.2853407
M3 - Journal article
C2 - 30072344
AN - SCOPUS:85050963647
SN - 2162-237X
VL - 30
SP - 876
EP - 890
JO - IEEE Transactions on Neural Networks and Learning Systems
JF - IEEE Transactions on Neural Networks and Learning Systems
IS - 3
M1 - 8423698
ER -