Abstract
Cluster discovery is an essential part of many data mining applications. While the cluster discovery process is mainly unsupervised in nature, it can often be aided by a small amount of labeled data. A probabilistic model of the clustering structure is adopted, and a novel unified energy equation for clustering that incorporates both labeled and unlabeled data is introduced. The formulation is inspired by a force-field model that integrates labeling constraints on the labeled data with similarity information on the unlabeled data for joint estimation. Experimental results show that good clusters can be identified using a small amount of labeled data.
Original language | English |
---|---|
Pages (from-to) | 37-46 |
Number of pages | 10 |
Journal | Applied Intelligence |
Volume | 22 |
Issue number | 1 |
Publication status | Published - Jan 2005 |
Externally published | Yes |
Scopus Subject Areas
- Artificial Intelligence
User-Defined Keywords
- Clustering
- Semi-supervised learning
- Markov model