TY - JOUR
T1 - Semi-Supervised Ensemble Clustering Based on Selected Constraint Projection
AU - Yu, Zhiwen
AU - Luo, Peinan
AU - Liu, Jiming
AU - Wong, Hau San
AU - You, Jane
AU - Han, Guoqiang
AU - Zhang, Jun
N1 - Funding Information:
The work described in this paper was partially funded by grants from the NSFC (No. 61722205, No. 61751205, No. 61572199, No. 61572540, No. 61472145, and No. U1611461), the grant from the Guangdong Natural Science Funds (No. 2017A030312008), the grant from the Science and Technology Planning Project of Guangdong Province, China (No. 2015A050502011, No. 2016B090918042, No. 2016A050503015, No. 2016B010127003), the grant from the Guangzhou Science and Technology Planning Project (No. 201704030051), the grant from the Research Grants Council of the Hong Kong Special Administrative Region, China (No. CityU 11300715), the grant from the Hong Kong General Research Grant (GRF, Ref. No. 152202/14E) and the PolyU Central Research Grant (G-YBJW), and a grant from the City University of Hong Kong (Project No. 7004884).
PY - 2018/12/1
Y1 - 2018/12/1
N2 - Traditional cluster ensemble approaches have several limitations. (1) Few make use of prior knowledge provided by experts. (2) It is difficult to achieve good performance on high-dimensional datasets. (3) All ensemble members are given equal weight, which ignores their different contributions. (4) Not all pairwise constraints contribute to the final result. To address these limitations, we propose double weighting semi-supervised ensemble clustering based on selected constraint projection (DCECP), which applies both constraint weighting and ensemble member weighting. Specifically, DCECP first adopts the random subspace technique in combination with the constraint projection procedure to handle high-dimensional datasets. Second, it treats the prior knowledge of experts as pairwise constraints and assigns different subsets of pairwise constraints to different ensemble members. An adaptive ensemble member weighting process is designed to associate different weight values with different ensemble members. Third, the weighted normalized cut algorithm is adopted to summarize the clustering solutions and generate the final result. Finally, nonparametric statistical tests are used to compare multiple algorithms on real-world datasets. Our experiments on 15 high-dimensional datasets show that DCECP performs better than most clustering algorithms.
AB - Traditional cluster ensemble approaches have several limitations. (1) Few make use of prior knowledge provided by experts. (2) It is difficult to achieve good performance on high-dimensional datasets. (3) All ensemble members are given equal weight, which ignores their different contributions. (4) Not all pairwise constraints contribute to the final result. To address these limitations, we propose double weighting semi-supervised ensemble clustering based on selected constraint projection (DCECP), which applies both constraint weighting and ensemble member weighting. Specifically, DCECP first adopts the random subspace technique in combination with the constraint projection procedure to handle high-dimensional datasets. Second, it treats the prior knowledge of experts as pairwise constraints and assigns different subsets of pairwise constraints to different ensemble members. An adaptive ensemble member weighting process is designed to associate different weight values with different ensemble members. Third, the weighted normalized cut algorithm is adopted to summarize the clustering solutions and generate the final result. Finally, nonparametric statistical tests are used to compare multiple algorithms on real-world datasets. Our experiments on 15 high-dimensional datasets show that DCECP performs better than most clustering algorithms.
KW - Cluster ensemble
KW - pairwise constraint
KW - projection
KW - semi-supervised clustering
UR - http://www.scopus.com/inward/record.url?scp=85044398746&partnerID=8YFLogxK
U2 - 10.1109/TKDE.2018.2818729
DO - 10.1109/TKDE.2018.2818729
M3 - Journal article
AN - SCOPUS:85044398746
SN - 1041-4347
VL - 30
SP - 2394
EP - 2407
JO - IEEE Transactions on Knowledge and Data Engineering
JF - IEEE Transactions on Knowledge and Data Engineering
IS - 12
M1 - 8323237
ER -