TY - JOUR
T1 - Neighborhood combination entropy-based label distribution feature selection with instance similarity and feature redundancy
AU - Li, Kewen
AU - Qian, Wenbin
AU - Cai, Xingxing
AU - Huang, Jintao
N1 - Funding information:
This work is supported by the National Natural Science Foundation of China (No. 62366019 and No. 61966016), the Jiangxi Provincial Natural Science Foundation, China (No. 20242BAB23014), and the National Key Research and Development Program of China (No. 2024YFF1307305).
Publisher copyright:
© 2025 Elsevier Inc.
PY - 2025/12/15
Y1 - 2025/12/15
N2 - Feature selection remains a critical preprocessing step in multi-label learning to address the inherent challenges of high dimensionality. Prevailing methodologies predominantly rely on logical labels, often overlooking quantifiable label significance and instance-specific label relevance, which constitutes a major limitation. To address this issue, a neighborhood combination entropy-based label distribution feature selection approach with instance similarity and feature redundancy is presented. Initially, label enhancement is achieved through the integration of label weights and conditional probabilities, generating label distributions that capture instance correlations and label dependencies. Subsequently, a neighborhood combination entropy measure of multi-label data is proposed through the construction of neighborhood structures using instance similarity, enabling uncertainty quantification. Furthermore, feature significance is quantified via neighborhood combination mutual information, concurrently maximizing feature-label relevance while minimizing feature-feature redundancy, thereby establishing a feature ranking methodology. Finally, extensive experiments on thirteen benchmark datasets verify that the proposed algorithm outperforms six state-of-the-art approaches in terms of six evaluation metrics.
AB - Feature selection remains a critical preprocessing step in multi-label learning to address the inherent challenges of high dimensionality. Prevailing methodologies predominantly rely on logical labels, often overlooking quantifiable label significance and instance-specific label relevance, which constitutes a major limitation. To address this issue, a neighborhood combination entropy-based label distribution feature selection approach with instance similarity and feature redundancy is presented. Initially, label enhancement is achieved through the integration of label weights and conditional probabilities, generating label distributions that capture instance correlations and label dependencies. Subsequently, a neighborhood combination entropy measure of multi-label data is proposed through the construction of neighborhood structures using instance similarity, enabling uncertainty quantification. Furthermore, feature significance is quantified via neighborhood combination mutual information, concurrently maximizing feature-label relevance while minimizing feature-feature redundancy, thereby establishing a feature ranking methodology. Finally, extensive experiments on thirteen benchmark datasets verify that the proposed algorithm outperforms six state-of-the-art approaches in terms of six evaluation metrics.
KW - Feature selection
KW - Neighborhood rough sets
KW - Multi-label learning
KW - Label distribution
KW - Mutual information
U2 - 10.1016/j.ins.2025.122999
DO - 10.1016/j.ins.2025.122999
M3 - Journal article
SN - 0020-0255
JO - Information Sciences
JF - Information Sciences
M1 - 122999
ER -