Abstract
The performance of many supervised and unsupervised learning algorithms is very sensitive to the choice of an appropriate distance metric. Previous work in metric learning and adaptation has mostly focused on classification tasks, making use of class label information. In standard clustering tasks, however, class label information is not available. To adapt the metric so as to improve the clustering results, some background knowledge or side information is needed. One useful type of side information takes the form of pairwise similarity or dissimilarity information. Recently, some novel methods (e.g., the parametric method proposed by Xing et al.) for learning global metrics based on pairwise side information have shown promising results. In this paper, we propose a nonparametric method, called relaxational metric adaptation (RMA), for the same metric adaptation problem. While RMA is local in the sense that it allows locally adaptive metrics, it is also global because even patterns not in the vicinity can have long-range effects on the metric adaptation process. Experimental results for semi-supervised clustering based on both simulated and real-world data sets show that RMA outperforms Xing et al.'s method in most situations. Besides applying RMA to semi-supervised learning, we have also used it to improve the performance of content-based image retrieval systems through metric adaptation. Experimental results based on two real-world image databases show that RMA significantly outperforms other methods in improving image retrieval performance.
Original language | English |
---|---|
Pages (from-to) | 1905-1917 |
Number of pages | 13 |
Journal | Pattern Recognition |
Volume | 39 |
Issue number | 10 |
Publication status | Published - Oct 2006 |
Scopus Subject Areas
- Software
- Signal Processing
- Computer Vision and Pattern Recognition
- Artificial Intelligence
User-Defined Keywords
- Constrained k-means
- Content-based image retrieval
- Distance metric
- Nonparametric method
- Pairwise similarity and dissimilarity
- Semi-supervised clustering
- Side information