k-means clustering with outlier removal

Guojun Gan*, Kwok Po NG

*Corresponding author for this work

Research output: Contribution to journal › Journal article › peer-review

192 Citations (Scopus)

Abstract

Outlier detection is an important data analysis task in its own right, and removing outliers from clusters can improve clustering accuracy. In this paper, we extend the k-means algorithm to perform data clustering and outlier detection simultaneously by introducing an additional “cluster” that holds all outliers. We design an iterative procedure to optimize the objective function of the proposed algorithm and establish its convergence. Numerical experiments on both synthetic and real data demonstrate the effectiveness and efficiency of the proposed algorithm.
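
The idea of an extra "cluster" that holds outliers can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the objective function, so the outlier rule used here (squared distance to the nearest center greater than `gamma` times the mean squared distance of the current non-outliers), the cap `max_outliers`, the label `-1`, and the function name `kmeans_with_outlier_cluster` are all assumptions made for illustration.

```python
import numpy as np

def kmeans_with_outlier_cluster(X, k, gamma=3.0, max_outliers=1,
                                init_centers=None, n_iter=100, seed=0):
    """Illustrative k-means variant; label -1 plays the role of the outlier "cluster".

    Assumed rule (not taken from the abstract): a point is an outlier when its
    squared distance to the nearest center exceeds gamma times the mean squared
    distance of the points that were non-outliers in the previous pass.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if init_centers is None:
        rng = np.random.default_rng(seed)
        centers = X[rng.choice(n, size=k, replace=False)].copy()
    else:
        centers = np.array(init_centers, dtype=float)

    labels = np.full(n, -2)              # sentinel so the first pass never "converges"
    outlier = np.zeros(n, dtype=bool)    # no point starts as an outlier

    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every center (n x k).
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        nearest = d2.argmin(axis=1)
        dmin = d2[np.arange(n), nearest]

        # Threshold based on the previous pass's non-outliers.
        threshold = gamma * dmin[~outlier].mean()
        new_outlier = dmin > threshold

        # Keep only the farthest points if too many were flagged.
        if new_outlier.sum() > max_outliers:
            farthest = np.argsort(dmin)[::-1][:max_outliers]
            new_outlier = np.zeros(n, dtype=bool)
            new_outlier[farthest] = True

        new_labels = np.where(new_outlier, -1, nearest)

        # Recompute each center from its non-outlier members only.
        for j in range(k):
            members = X[new_labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)

        if np.array_equal(new_labels, labels):
            break
        labels, outlier = new_labels, new_outlier

    return labels, centers


# Usage: two tight groups and one distant point.
offsets = np.array([[0.1, 0.0], [-0.1, 0.0], [0.0, 0.1], [0.0, -0.1]])
X = np.vstack([offsets, offsets + 10.0, [[100.0, 100.0]]])
labels, centers = kmeans_with_outlier_cluster(
    X, k=2, max_outliers=2, init_centers=[[0.0, 0.0], [10.0, 10.0]]
)
```

Because flagged points are excluded from the center updates, the distant point does not drag a center away from its group. The sketch simply stops after `n_iter` passes if the labels keep changing; it makes no claim about the convergence result established in the paper.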

Original language: English
Pages (from-to): 8-14
Number of pages: 7
Journal: Pattern Recognition Letters
Volume: 90
Publication status: Published - 15 Apr 2017

Scopus Subject Areas

  • Software
  • Signal Processing
  • Computer Vision and Pattern Recognition
  • Artificial Intelligence

User-Defined Keywords

  • Data clustering
  • k-means
  • Outlier detection
