Hierarchical Information-Theoretic Co-Clustering for high dimensional data

Yuanyuan Wang*, Yunming Ye, Xutao Li, Kwok Po NG, Joshua Huang

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Hierarchical clustering is an important technique for hierarchical data exploration applications. However, most existing hierarchical methods are based on traditional one-sided clustering, which is not effective for handling high-dimensional data. In this paper, we develop a partitional hierarchical co-clustering framework and propose a Hierarchical Information-Theoretical Co-Clustering (HITCC) algorithm. The algorithm conducts a series of binary partitions of the objects in a data set via the Information-Theoretical Co-Clustering (ITCC) procedure, and generates a hierarchical organization of object clusters. Because features and objects are clustered simultaneously while the cluster tree is built, the HITCC algorithm can identify subspace clusters at different levels of abstraction and acquire good clustering hierarchies. Compared with the flat ITCC algorithm and six state-of-the-art hierarchical clustering algorithms on various data sets, the new algorithm demonstrated much better performance. ICIC International

Original language: English
Pages (from-to): 487-500
Number of pages: 14
Journal: International Journal of Innovative Computing, Information and Control
Volume: 7
Issue number: 1
Publication status: Published - Jan 2011

Scopus Subject Areas

  • Software
  • Theoretical Computer Science
  • Information Systems
  • Computational Theory and Mathematics

User-Defined Keywords

  • Co-clustering
  • Hierarchical clustering
  • Text clustering
