TY - JOUR
T1 - Partial label feature selection via label disambiguation and neighborhood mutual information
AU - Ding, Jinfei
AU - Qian, Wenbin
AU - Li, Yihui
AU - Yang, Wenji
AU - Huang, Jintao
N1 - This work is supported by the National Natural Science Foundation of China (No. 62366019 and No. 62366018), the Natural Science Foundation of Jiangxi Province, China (No. 20224BAB202020 and No. 20224BAB202015), and the National Key Research and Development Program of China (No. 2022YFD1600202).
Publisher Copyright:
© 2024 Elsevier Inc.
PY - 2024/10
Y1 - 2024/10
AB - Partial label learning aims to learn from training instances, each of which is associated with a set of candidate labels, only one of which is the ground-truth label. Feature selection is an effective method to improve the generalization capability of the learning model; however, partial label feature selection is exceptionally challenging because the label information is limited and ambiguous. Therefore, this paper proposes a partial label feature selection algorithm based on label disambiguation and neighborhood mutual information. First, neighborhood granularity is utilized to determine the neighborhoods of instances in order to disambiguate the candidate labels. Second, based on the label confidence induced by disambiguation, feature relevance and redundancy are measured by neighborhood mutual information, which handles continuous features directly and thereby avoids the negative impact of data discretization on feature selection. Concurrently, the kappa coefficient is employed to estimate label consistency, which describes the influence of feature changes on the label space. Then, the significance of each feature is evaluated by fusing feature relevance, feature redundancy, and label consistency. Finally, the effectiveness of the proposed algorithm is verified by comparing it with four base classifiers and other feature selection methods. Furthermore, the feasibility of the proposed disambiguation method is demonstrated through comparison with four state-of-the-art disambiguation methods.
KW - Feature selection
KW - Granular computing
KW - Label disambiguation
KW - Neighborhood mutual information
KW - Partial label learning
UR - http://www.scopus.com/inward/record.url?scp=85198603083&partnerID=8YFLogxK
U2 - 10.1016/j.ins.2024.121163
DO - 10.1016/j.ins.2024.121163
M3 - Journal article
AN - SCOPUS:85198603083
SN - 0020-0255
VL - 680
JO - Information Sciences
JF - Information Sciences
M1 - 121163
ER -