K-means — A generalized k-means clustering algorithm with unknown cluster number

Yiu Ming Cheung*

*Corresponding author for this work

Research output: Chapter in book/report/conference proceedingConference proceedingpeer-review

10 Citations (Scopus)

Abstract

This paper presents a new clustering technique named STep-wise Automatic Rival-penalized (STAR) k-means algorithm (denoted as k-means), which is actually a generalized version of the conventional k-means (MacQueen 1967). Not only is this new algorithm applicable to ellipse-shaped data clusters rather than just to ball-shaped ones like the k-means algorithm, but also it can perform appropriate clustering without knowing cluster number by gradually penalizing the winning chance of those extra seed points during learning competition. Although the existing RPCL (Xu et al. 1993) can automatically select the cluster number as well by driving extra seed points far away from the input data set, its performance is much sensitive to the selection of the de-learning rate. To our best knowledge, there is still no theoretical result to guide its selection as yet. In contrast, the proposed k-means algorithm need not determine this rate. We have qualitatively analyzed its rival-penalized mechanism with the results well-justified by the experiments.

Original languageEnglish
Title of host publicationIntelligent Data Engineering and Automated Learning - IDEAL 2002 - 3rd International Conference, Proceedings
EditorsHujun Yin, Nigel Allinson, Richard Freeman, John Keane, Simon Hubbard
PublisherSpringer Verlag
Pages307-317
Number of pages11
ISBN (Print)9783540440253
DOIs
Publication statusPublished - 2002
Event3rd International Conference on Intelligent Data Engineering and Automated Learning, IDEAL 2002 - Manchester, United Kingdom
Duration: 12 Aug 200214 Aug 2002

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume2412
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Conference

Conference3rd International Conference on Intelligent Data Engineering and Automated Learning, IDEAL 2002
Country/TerritoryUnited Kingdom
CityManchester
Period12/08/0214/08/02

Scopus Subject Areas

  • Theoretical Computer Science
  • General Computer Science

Fingerprint

Dive into the research topics of 'K-means — A generalized k-means clustering algorithm with unknown cluster number'. Together they form a unique fingerprint.

Cite this