TY - GEN
T1 - Semi-supervised clustering via constrained symmetric non-negative matrix factorization
AU - Jing, Liping
AU - Yu, Jian
AU - Zeng, Tieyong
AU - Zhu, Yan
N1 - Copyright 2013 Elsevier B.V., All rights reserved.
PY - 2012
Y1 - 2012
AB - Semi-supervised clustering based on pairwise constraints has been an active research area in recent years. Pairwise constraints are of two types: must-link and cannot-link. Since the two types provide different information, they should be handled with different strategies in the learning process. In this paper, we investigate the effect of must-link and cannot-link constraints on non-negative matrix factorization (NMF) and show that they play different roles in guiding the factorization procedure. We then propose a new semi-supervised NMF model with pairwise constraint penalties: must-link constraints control the distance between data points in the compressed form, and cannot-link constraints control the encoding factor. The same penalty strategies are also applied to the symmetric NMF model to handle similarity matrices. Both proposed models are solved with an alternating nonnegative least squares algorithm. We evaluate the models on a series of real similarity data sets and compare them with state-of-the-art methods, showing that the new models provide superior clustering performance.
KW - NMF
KW - Pairwise Constraints
KW - Semi-supervised Clustering
KW - Symmetric NMF
UR - http://www.scopus.com/inward/record.url?scp=84871591595&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-35139-6_29
DO - 10.1007/978-3-642-35139-6_29
M3 - Conference proceeding
AN - SCOPUS:84871591595
SN - 9783642351389
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 309
EP - 319
BT - Brain Informatics - International Conference, BI 2012, Proceedings
T2 - 2012 International Conference on Brain Informatics, BI 2012
Y2 - 4 December 2012 through 7 December 2012
ER -