SMART: A subspace clustering algorithm that automatically identifies the appropriate number of clusters

Liping Jing, Junjie Li, Kwok Po NG*, Yiu Ming CHEUNG, Joshua Huang

*Corresponding author for this work

Research output: Contribution to journal / Journal article / peer-review

11 Citations (Scopus)

Abstract

This paper presents a subspace κ-means clustering algorithm for high-dimensional data with automatic selection of κ. A new penalty term is introduced into the objective function of the fuzzy κ-means clustering process so that several clusters compete for objects, which leads to the merging of some cluster centres and the identification of the 'true' number of clusters. The algorithm determines the number of clusters in a dataset by adjusting the penalty term factor. A subspace cluster validation index is proposed and employed to verify the subspace clustering results generated by the algorithm. Experimental results on both synthetic and real data demonstrate that the algorithm is effective in producing consistent clustering results and the correct number of clusters. Some real datasets are also used to show how the proposed algorithm can identify interesting sub-clusters in the data.
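
The abstract does not state the penalised objective, so the mechanism it describes (clusters competing for objects until some centres merge) can only be illustrated with a stand-in. The sketch below is a minimal example under stated assumptions: it uses a generic entropy-regularised soft κ-means, not the SMART objective, and it omits the subspace (feature-weighting) part entirely. The function name `soft_kmeans_with_merging`, the parameter `lam` (standing in for the penalty term factor), the merge tolerance, and the toy data are all hypothetical.

```python
import numpy as np


def soft_kmeans_with_merging(X, k_init, lam, n_iter=500, merge_tol=1e-3):
    """Entropy-regularised soft k-means started with too many centres.

    Illustrative stand-in only: this is NOT the SMART objective and it
    has no subspace (feature-weight) component.
    """
    # Deterministic start: centres taken at evenly spaced rows of X.
    idx = np.linspace(0, len(X) - 1, k_init).astype(int)
    centres = X[idx].astype(float)
    for _ in range(n_iter):
        # Squared Euclidean distance from every object to every centre.
        d = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        # Soft memberships: row-wise softmax of -d / lam.
        logits = -d / lam
        logits -= logits.max(axis=1, keepdims=True)
        u = np.exp(logits)
        u /= u.sum(axis=1, keepdims=True)
        # Each centre moves to the membership-weighted mean of the objects.
        weights = np.maximum(u.sum(axis=0), 1e-12)
        new_centres = (u.T @ X) / weights[:, None]
        converged = np.allclose(new_centres, centres, rtol=0.0, atol=1e-10)
        centres = new_centres
        if converged:
            break
    # Collapse centres that have landed on (numerically) the same point.
    merged = []
    for c in centres:
        if all(np.linalg.norm(c - m) >= merge_tol for m in merged):
            merged.append(c)
    return np.array(merged)


# Toy data (assumed for illustration): two well-separated 2-D blobs.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=0.0, scale=0.3, size=(50, 2)),
    rng.normal(loc=5.0, scale=0.3, size=(50, 2)),
])

# Start with six centres; the surviving distinct centres estimate the count.
centres = soft_kmeans_with_merging(X, k_init=6, lam=1.0)
print(len(centres), "distinct centres remain")
```

In this stand-in, a larger `lam` makes memberships softer and pulls more centres together, while a very small `lam` approaches hard κ-means and leaves the initial centres separate. That loosely mirrors the abstract's statement that the number of clusters is determined by adjusting the penalty term factor.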

Original language: English
Pages (from-to): 149-177
Number of pages: 29
Journal: International Journal of Data Mining, Modelling and Management
Volume: 1
Issue number: 2
Publication status: Published - May 2009

Scopus Subject Areas

  • Management Information Systems
  • Modelling and Simulation
  • Computer Science Applications

User-Defined Keywords

  • Cluster numbers
  • Data mining
  • Subspace clustering
  • Weighting
  • κ-means
