Abstract
Multi-label learning has gained significant attention in classification tasks, but challenges remain in handling high-dimensional data. Although feature selection techniques can alleviate these issues, neglecting the unbalanced data distribution problem severely undermines the models’ accuracy. Furthermore, existing methods fail to account for the importance and correlation of labels. In this paper, we present a novel multi-label feature selection algorithm that addresses these issues through three innovations: (1) using k-nearest neighbors to capture local similarities in unbalanced data, (2) enhancing labels by converting them into distributions to enrich semantic information, and (3) introducing a new evaluation function to assess label correlations. A multi-criteria strategy is established to maximize feature-label relevance, minimize redundancy, and strengthen label correlations. Experimental results on fifteen multi-label datasets demonstrate the algorithm’s superiority over five state-of-the-art methods.
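As an illustration of innovations (1) and (2) above, the following minimal sketch shows one simple way binary labels could be turned into label distributions using k-nearest neighbors. It is not the paper's algorithm: the function name `knn_label_enhancement`, the Euclidean distance, the plain averaging rule, and the uniform fallback for unlabeled neighborhoods are all assumptions made for this example.

```python
import numpy as np


def knn_label_enhancement(X, Y, k=3):
    """Hypothetical sketch: smooth binary labels over the k nearest
    neighbours and normalise each row into a label distribution.

    X: (n_samples, n_features) feature matrix.
    Y: (n_samples, n_labels) binary label matrix.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)

    # Pairwise squared Euclidean distances (assumed metric).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.einsum("ijk,ijk->ij", diff, diff)
    np.fill_diagonal(dist, np.inf)  # a sample is not its own neighbour

    # Indices of the k nearest neighbours of every sample.
    neighbours = np.argsort(dist, axis=1)[:, :k]

    # Average each sample's own labels with its neighbours' labels.
    smoothed = (Y + Y[neighbours].sum(axis=1)) / (k + 1)

    # Normalise rows to sum to 1; rows with no label mass become uniform.
    totals = smoothed.sum(axis=1, keepdims=True)
    safe_totals = np.where(totals > 0, totals, 1.0)
    return np.where(totals > 0, smoothed / safe_totals, 1.0 / Y.shape[1])


if __name__ == "__main__":
    X = [[0, 0], [0, 1], [5, 5], [5, 6]]
    Y = [[1, 0], [1, 0], [0, 1], [1, 1]]
    print(knn_label_enhancement(X, Y, k=1))
```

Each row of the result is a distribution over labels, so a label that is absent for a sample but common among its neighbors still receives some weight. A feature-selection criterion could then score features against these graded values rather than against hard 0/1 labels.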
| Original language | English |
|---|---|
| Article number | 113028 |
| Number of pages | 19 |
| Journal | Applied Soft Computing |
| Volume | 175 |
| Publication status | Published - May 2025 |
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 9 Industry, Innovation, and Infrastructure
User-Defined Keywords
- Feature selection
- Label distribution
- Label enhancement
- Multi-label learning
- Neighborhood rough set
Title
Feature selection via label enhancement and neighborhood rough set for multi-label data with unbalanced distribution