Clustering categorical data sets using tabu search techniques

Michael K. Ng*, Joyce C. Wong

*Corresponding author for this work

Research output: Contribution to journal › Journal article › peer-review

81 Citations (Scopus)

Abstract

Clustering methods partition a set of objects into clusters such that objects in the same cluster are more similar to each other than to objects in different clusters, according to some defined criteria. The fuzzy k-means-type algorithm is best suited for implementing this clustering operation because of its effectiveness in clustering data sets. However, it works only on numeric values, which limits its use because data sets often contain categorical values. In this paper, we present a tabu search based clustering algorithm that extends the k-means paradigm to categorical domains and to domains with both numeric and categorical values. Using tabu search based techniques, our algorithm can explore the solution space beyond local optimality, with the aim of finding a global solution of the fuzzy clustering problem. The clustering results produced by the proposed algorithm are found to be very high in accuracy.

Original language: English
Pages (from-to): 2783-2790
Number of pages: 8
Journal: Pattern Recognition
Volume: 35
Issue number: 12
Publication status: Published - Dec 2002

Scopus Subject Areas

  • Software
  • Signal Processing
  • Computer Vision and Pattern Recognition
  • Artificial Intelligence

User-Defined Keywords

  • Categorical data
  • Clustering
  • K-means
  • K-modes
  • Numeric data
  • Tabu search
