K-means-type algorithms on distributed memory computer

M. K. Ng*

*Corresponding author for this work

Research output: Contribution to journal, Journal article, peer-review

8 Citations (Scopus)

Abstract

Partitioning a set of objects into homogeneous clusters is a fundamental operation in data mining. The k-means-type algorithm is best suited for implementing this operation because of its efficiency in clustering large numerical and categorical data sets. We consider an efficient parallel k-means-type algorithm for clustering data sets on a distributed shared-nothing parallel system. Its communication scheme is simple: each iteration requires only one round of information exchange. We show that the speedup of our algorithm is asymptotically linear when the number of objects is sufficiently large. We implement the parallel k-means-type algorithm on an IBM SP2 parallel machine. Performance studies show that the algorithm parallelizes well in experiments.
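
The one-exchange-per-iteration idea can be illustrated with a minimal sketch. This is not the paper's implementation: it is a single-process Python simulation that assumes numerical data and squared Euclidean distance, and the names `local_step`, `exchange`, and `parallel_kmeans` are hypothetical. Each "processor" holds its own partition of the objects, computes per-cluster sums and counts locally, and the partial results are combined once per iteration before the centers are updated.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance (an assumption; numerical data only)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def local_step(partition: List[Point], centers: List[Point]):
    """Work done by one simulated processor on its own objects.

    Assigns each local object to its nearest center and accumulates
    per-cluster coordinate sums and counts. No communication happens here.
    """
    k, dim = len(centers), len(centers[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for p in partition:
        j = min(range(k), key=lambda c: sq_dist(p, centers[c]))
        counts[j] += 1
        for d in range(dim):
            sums[j][d] += p[d]
    return sums, counts


def exchange(partials):
    """The single round of information exchange in an iteration.

    Combines the partial sums and counts from every processor. On a real
    distributed-memory machine this would be a collective reduction; here
    it is a plain loop.
    """
    k = len(partials[0][1])
    dim = len(partials[0][0][0])
    total_sums = [[0.0] * dim for _ in range(k)]
    total_counts = [0] * k
    for sums, counts in partials:
        for j in range(k):
            total_counts[j] += counts[j]
            for d in range(dim):
                total_sums[j][d] += sums[j][d]
    return total_sums, total_counts


def parallel_kmeans(partitions: List[List[Point]],
                    centers: List[Point],
                    max_iter: int = 100) -> List[Point]:
    """Iterate local steps and one exchange until the centers stop changing."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        partials = [local_step(part, centers) for part in partitions]
        sums, counts = exchange(partials)
        new_centers = [
            tuple(s / counts[j] for s in sums[j]) if counts[j] else centers[j]
            for j in range(len(centers))
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


# Two simulated processors, two objects each, two clusters.
partitions = [[(0.0, 0.0), (0.0, 1.0)], [(10.0, 10.0), (10.0, 11.0)]]
print(parallel_kmeans(partitions, [(0.0, 0.0), (10.0, 10.0)]))
# prints [(0.0, 0.5), (10.0, 10.5)]
```

In this sketch the amount of data combined in `exchange` depends on the number of clusters and the dimension, not on the number of objects, while the work in `local_step` grows with the size of each partition. That is one plausible intuition for why speedup could approach linear as the number of objects grows; the paper's own analysis should be consulted for the actual argument.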

Original language: English
Pages (from-to): 75-91
Number of pages: 17
Journal: International Journal of High Speed Computing
Volume: 11
Issue number: 2
Publication status: Published - Jun 2000

Scopus Subject Areas

  • Theoretical Computer Science
  • Computational Theory and Mathematics

User-Defined Keywords

  • Clustering
  • Data mining
  • K-means-type algorithm
  • Parallel algorithms
