Automated variable weighting in k-means type clustering

Joshua Zhexue Huang*, Michael K. Ng, Hongqiang Rong, Zichen Li

*Corresponding author for this work

Research output: Contribution to journal › Journal article › peer-review

712 Citations (Scopus)

Abstract

This paper proposes a k-means type clustering algorithm that can automatically calculate variable weights. A new step is introduced to the k-means clustering process to iteratively update variable weights based on the current partition of data, and a formula for weight calculation is proposed. The convergence theorem of the new clustering process is given. The variable weights produced by the algorithm measure the importance of variables in clustering and can be used in variable selection in data mining applications where large and complex real data are often involved. Experimental results on both synthetic and real data have shown that the new algorithm outperformed the standard k-means type algorithms in recovering clusters in data.
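
The extra weight-update step described above can be illustrated with a minimal sketch. The abstract does not state the weight formula, so everything specific here is an assumption: the function name `weighted_kmeans`, the exponent parameter `beta` (taken to be greater than 1), the per-variable squared difference as the distance, the explicit `init_centers` argument, and the update w_j = 1 / sum_t (D_j / D_t)^(1/(beta-1)), where D_j is the total within-cluster dispersion on variable j. It is meant as a reading aid for numeric data, not as the authors' implementation.

```python
def weighted_kmeans(X, init_centers, beta=2.0, max_iter=100):
    """Sketch of k-means with an added variable-weight update step.

    X is a list of numeric rows; init_centers is a list of starting centers.
    All names and the weight formula are assumptions made for illustration.
    """
    if beta <= 1.0:
        raise ValueError("this sketch assumes beta > 1")
    n, m = len(X), len(X[0])
    centers = [list(c) for c in init_centers]
    weights = [1.0 / m] * m          # start with equal variable weights
    labels = [-1] * n
    for _ in range(max_iter):
        # Step 1: assign each object to the nearest center under the
        # weighted squared distance sum_j w_j^beta * (x_j - z_j)^2.
        new_labels = []
        for x in X:
            dists = [
                sum(weights[j] ** beta * (x[j] - c[j]) ** 2 for j in range(m))
                for c in centers
            ]
            new_labels.append(dists.index(min(dists)))
        if new_labels == labels:     # partition unchanged: stop
            break
        labels = new_labels
        # Step 2: recompute each center as the mean of its members.
        for l in range(len(centers)):
            members = [x for x, lab in zip(X, labels) if lab == l]
            if members:
                centers[l] = [sum(p[j] for p in members) / len(members)
                              for j in range(m)]
        # Step 3 (the added step): update variable weights from the
        # within-cluster dispersion D_j of each variable.
        D = [sum((x[j] - centers[lab][j]) ** 2 for x, lab in zip(X, labels))
             for j in range(m)]
        nonzero = [d for d in D if d > 0.0]
        if nonzero:
            e = 1.0 / (beta - 1.0)
            # Assumption: a variable with zero dispersion gets weight 0.
            weights = [
                0.0 if D[j] == 0.0
                else 1.0 / sum((D[j] / d) ** e for d in nonzero)
                for j in range(m)
            ]
    return labels, centers, weights


# Toy data: variable 0 separates two groups, variable 1 is noise.
X = [[0.0, 5.0], [0.2, 1.0], [0.1, 9.0],
     [10.0, 2.0], [10.2, 8.0], [9.9, 4.0]]
labels, centers, weights = weighted_kmeans(X, init_centers=[X[0], X[3]])
print(labels)
print(weights)
```

In this toy run the variable that separates the two groups ends up with almost all of the weight, which is the sense in which such weights can guide variable selection. Like ordinary k-means, the sketch depends on the starting centers, so a poor start can lead to a different partition and different weights.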

Original language: English
Pages (from-to): 657-668
Number of pages: 12
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
Volume: 27
Issue number: 5
Publication status: Published - May 2005

Scopus Subject Areas

  • Software
  • Computer Vision and Pattern Recognition
  • Computational Theory and Mathematics
  • Artificial Intelligence
  • Applied Mathematics

User-Defined Keywords

  • Clustering
  • Data mining
  • Feature evaluation and selection
  • Mining methods and algorithms
