Feature selection via label enhancement and neighborhood rough set for multi-label data with unbalanced distribution

Wenbin Qian*, Wenyong Ruan, Xiwen Lu, Wenji Yang, Jintao Huang

*Corresponding author for this work

Research output: Contribution to journal › Journal article › peer-review

Abstract

Multi-label learning has gained significant attention in classification tasks, but challenges remain in handling high-dimensional data. Although feature selection techniques can alleviate these issues, neglecting the unbalanced data distribution problem severely undermines the models’ accuracy. Furthermore, existing methods fail to account for the importance and correlation of labels. In this paper, we present a novel multi-label feature selection algorithm that addresses these issues through three innovations: (1) using k-nearest neighbors to capture local similarities in unbalanced data, (2) enhancing labels by converting them into distributions to enrich semantic information, and (3) introducing a new evaluation function to assess label correlations. A multi-criteria strategy is established to maximize feature-label relevance, minimize redundancy, and strengthen label correlations. Experimental results on fifteen multi-label datasets demonstrate the algorithm’s superiority over five state-of-the-art methods.
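The abstract does not give the formulas behind innovations (1) and (2), so the following is only a minimal sketch of the general idea, not the authors' algorithm: each sample's binary label vector is blended with the labels of its k nearest neighbors in feature space and then normalized into a label distribution. The function name `knn_label_enhancement`, the Euclidean distance, the equal weighting of own and neighbor labels, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def knn_label_enhancement(X, Y, k=3):
    """Hypothetical helper: turn binary labels Y (n x q) into label
    distributions by blending each sample's labels with those of its
    k nearest neighbours (Euclidean distance in feature space)."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    q = Y.shape[1]

    # Pairwise Euclidean distances between all samples.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))

    # Exclude each sample from its own neighbour list.
    np.fill_diagonal(dist, np.inf)
    nn = np.argsort(dist, axis=1)[:, :k]      # shape (n, k)

    # Own logical labels plus the mean label vector of the neighbours.
    scores = Y + Y[nn].mean(axis=1)           # shape (n, q)

    # Normalise each row to sum to 1; rows with no positive evidence
    # fall back to a uniform distribution (an assumption of this sketch).
    totals = scores.sum(axis=1, keepdims=True)
    safe_totals = np.where(totals > 0, totals, 1.0)
    return np.where(totals > 0, scores / safe_totals, 1.0 / q)

# Toy data (made up): two tight clusters, two labels.
X = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]]
Y = [[1, 0], [1, 1], [0, 1], [0, 1]]
D = knn_label_enhancement(X, Y, k=1)
print(D)
```

Each row of `D` sums to 1, so a label that is absent for a sample but present among its neighbors receives a small nonzero degree. This is one simple way that local similarity can enrich the label information; the paper's actual enhancement and evaluation functions may differ.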
Original language: English
Article number: 113028
Number of pages: 19
Journal: Applied Soft Computing
Volume: 175
Publication status: Published - May 2025

User-Defined Keywords

  • Feature selection
  • Label distribution
  • Label enhancement
  • Multi-label learning
  • Neighborhood rough set
