Recent studies have suggested that extremely low-dimensional projected clusters exist in real datasets. Here, we propose a new algorithm for identifying them. It combines object clustering with dimension selection, and allows domain knowledge to be supplied as input to guide the clustering process. Theoretical and experimental results show that even a small amount of input knowledge can help detect clusters with only 1% of the relevant dimensions. We also show that this semi-supervised algorithm can perform knowledge-guided selective clustering when there are multiple meaningful object groupings. The algorithm is also shown to be effective in analysing a microarray dataset.
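The abstract does not spell out the algorithm's internals, so the following is only an illustrative sketch of the general idea of semi-supervised projected clustering, not the authors' method. It assumes the input knowledge takes the form of a few labeled objects per cluster, that a dimension counts as relevant to a cluster when those labeled objects vary much less along it than the dataset as a whole, and that each object is then assigned using only each cluster's own selected dimensions. The function names, the `ratio` threshold, and the single-pass assignment are all hypothetical choices made for illustration.

```python
from statistics import mean, pvariance


def column_variances(data):
    """Population variance of every dimension over the whole dataset."""
    return [pvariance([row[d] for row in data]) for d in range(len(data[0]))]


def select_dimensions(data, seed_ids, variances, ratio=0.1):
    """Keep dimensions where the labeled objects vary far less than the dataset.

    `ratio` is a hypothetical threshold; a constant dimension (variance 0)
    is never selected because 0 < 0 is false.
    """
    selected = []
    for d, global_var in enumerate(variances):
        seed_var = pvariance([data[i][d] for i in seed_ids])
        if seed_var < ratio * global_var:
            selected.append(d)
    return selected


def build_clusters(data, labeled, ratio=0.1):
    """Map each cluster name to (selected dimensions, centre on those dimensions)."""
    variances = column_variances(data)
    clusters = {}
    for name, seed_ids in labeled.items():
        dims = select_dimensions(data, seed_ids, variances, ratio)
        centre = {d: mean([data[i][d] for i in seed_ids]) for d in dims}
        clusters[name] = (dims, centre)
    return clusters


def assign(data, clusters):
    """Assign each object to the closest cluster.

    Distance is the mean squared deviation from the cluster centre, measured
    only on that cluster's selected dimensions and scaled by each dimension's
    global variance so that scores are comparable across clusters.
    """
    variances = column_variances(data)
    labels = []
    for row in data:
        best, best_score = None, float("inf")
        for name, (dims, centre) in clusters.items():
            if not dims:
                continue  # no relevant dimension was found for this cluster
            score = mean([(row[d] - centre[d]) ** 2 / variances[d] for d in dims])
            if score < best_score:
                best, best_score = name, score
        labels.append(best)
    return labels
```

With very few labeled objects, an irrelevant dimension can look tight purely by chance, which is one reason a sketch like this would need more careful statistics before being applied to clusters whose relevant dimensions are a tiny fraction of the total.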
| Field | Value |
|---|---|
| Number of pages | 31 |
| Journal | International Journal of Data Mining and Bioinformatics |
| Publication status | Published - 2009 |
Scopus Subject Areas
- Information Systems
- Biochemistry, Genetics and Molecular Biology (all)
- Library and Information Sciences