Abstract
Outlier detection is an important data analysis task in its own right, and removing outliers from clusters can improve clustering accuracy. In this paper, we extend the k-means algorithm to perform data clustering and outlier detection simultaneously by introducing an additional “cluster” that holds all outliers. We design an iterative procedure to optimize the objective function of the proposed algorithm and establish the convergence of that procedure. Numerical experiments on both synthetic and real data demonstrate the effectiveness and efficiency of the proposed algorithm.
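The idea of an extra "cluster" for outliers can be illustrated with a minimal sketch. It is not the paper's exact objective function or update rule, which the abstract does not spell out. The sketch assumes a point is treated as an outlier when its squared distance to the nearest center exceeds `gamma` times the mean squared distance of the current inliers, and that at most `max_outliers` points may be flagged. The function name and the parameters `gamma`, `max_outliers`, and `init` are all hypothetical and chosen for illustration only.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_with_outlier_cluster(points, k, gamma=3.0, max_outliers=None,
                                n_iter=100, init=None, seed=0):
    """k-means variant with an extra cluster (label k) that holds outliers.

    Assumed rule (not taken from the paper): a point is an outlier if its
    squared distance to the nearest center exceeds gamma times the mean
    squared distance of the points currently labelled as inliers.
    Returns (labels, centers).
    """
    n = len(points)
    if max_outliers is None:
        max_outliers = n // 10
    max_outliers = min(max_outliers, n - k)  # always keep some inliers
    rng = random.Random(seed)
    centers = [list(c) for c in (init or rng.sample(points, k))]
    labels = [-1] * n  # -1 means "not yet assigned"

    for _ in range(n_iter):
        # Assignment step: nearest regular center for every point.
        nearest = []
        for p in points:
            d = [sq_dist(p, c) for c in centers]
            j = min(range(k), key=d.__getitem__)
            nearest.append((j, d[j]))

        # Threshold from the points that were inliers in the previous pass
        # (all points on the first pass).
        inlier_d = [nearest[i][1] for i in range(n) if labels[i] != k]
        threshold = gamma * sum(inlier_d) / len(inlier_d)

        # Flag the farthest points above the threshold, up to the cap.
        candidates = sorted((i for i in range(n) if nearest[i][1] > threshold),
                            key=lambda i: -nearest[i][1])
        outliers = set(candidates[:max_outliers])
        new_labels = [k if i in outliers else nearest[i][0] for i in range(n)]

        if new_labels == labels:  # assignments stable: stop iterating
            break
        labels = new_labels

        # Update step: centers are means of inliers only.
        for j in range(k):
            members = [points[i] for i in range(n) if labels[i] == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]

    return labels, centers


# Two tight groups plus one distant point; label 2 is the outlier cluster.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
labels, centers = kmeans_with_outlier_cluster(
    points, k=2, max_outliers=2, init=[(0, 0), (10, 10)])
print(labels)   # [0, 0, 0, 0, 1, 1, 1, 1, 2]
print(centers)  # [[0.5, 0.5], [10.5, 10.5]]
```

Because the distant point is moved to the outlier cluster, it no longer pulls the second center toward it, which is the sense in which removing outliers can improve the clustering.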
Original language | English |
---|---|
Pages (from-to) | 8-14 |
Number of pages | 7 |
Journal | Pattern Recognition Letters |
Volume | 90 |
Publication status | Published - 15 Apr 2017 |
Scopus Subject Areas
- Software
- Signal Processing
- Computer Vision and Pattern Recognition
- Artificial Intelligence
User-Defined Keywords
- Data clustering
- k-means
- Outlier detection