Selecting Heterogeneous Features Based on Unified Density-Guided Neighborhood Relation for Complex Biomedical Data Analysis

Lang Zhao, Yiqun Zhang*, Xiaopeng Luo, Yue Zhang, Yiu Ming Cheung, Kangshun Li

*Corresponding author for this work

Research output: Chapter in book/report/conference proceeding › Conference proceeding › peer-review

Abstract

Biomedical big data are usually high-dimensional and collected as a continuous influx of new features. Online Feature Selection (OFS) is a promising way to manage and analyze such data, as OFS circumvents the huge computation cost of considering all the features simultaneously, and can also dynamically maintain a distribution-fitting feature subset on the fly. However, almost all OFS solutions rest on the naive premise that all features are of the same type, overlooking the fact that real biomedical data sets usually consist of heterogeneous numerical and categorical features. This paper therefore proposes a new approach to Online Heterogeneous Feature Selection (OHFS), which dynamically maintains a feature subset that maximizes the number of neighborhood sets in which all objects are of the same class. To appropriately partition the objects into neighborhood sets, a density-guided relation is proposed, which adaptively forms non-overlapping neighborhood sets by detecting spatially compact objects. A unified density measure is also presented to avoid information loss in processing heterogeneous features. The proposed approach is parameter-free, interpretable, and efficient. It is capable of maintaining a concise feature subset while receiving any type of feature. Extensive experimental evaluations demonstrate its superiority.
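
The selection criterion described in the abstract can be illustrated with a minimal sketch. It assumes the partition of objects into non-overlapping neighborhood sets is already given; the paper's density-guided relation and unified density measure are not reproduced here, and the function name and toy data are hypothetical.

```python
from collections import defaultdict


def count_pure_neighborhoods(neighborhood_ids, labels):
    """Count neighborhood sets whose members all share one class label.

    neighborhood_ids[i] is the (assumed, precomputed) set that object i
    belongs to; labels[i] is the class of object i.
    """
    classes_in_set = defaultdict(set)
    for set_id, label in zip(neighborhood_ids, labels):
        classes_in_set[set_id].add(label)
    # A set is "pure" when exactly one class appears in it.
    return sum(1 for classes in classes_in_set.values() if len(classes) == 1)


# Hypothetical toy partition of six objects into three non-overlapping sets.
neighborhood_ids = [0, 0, 1, 1, 2, 2]
labels = ["a", "a", "a", "b", "b", "b"]
pure = count_pure_neighborhoods(neighborhood_ids, labels)
```

Under this reading, a candidate feature subset would be preferred when the partition it induces yields a larger count of such single-class sets.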

Original language: English
Title of host publication: Proceedings - 2023 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2023
Editors: Xingpeng Jiang, Haiying Wang, Reda Alhajj, Xiaohua Hu, Felix Engel, Mufti Mahmud, Nadia Pisanti, Xuefeng Cui, Hong Song
Publisher: IEEE
Pages: 771-778
Number of pages: 8
ISBN (Electronic): 9798350337488
ISBN (Print): 9798350337495
Publication status: Published - 5 Dec 2023
Event: 2023 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2023 - Istanbul, Turkey
Duration: 5 Dec 2023 to 8 Dec 2023
https://ieeexplore.ieee.org/xpl/conhome/10385250/proceeding

Publication series

Name: Proceedings - IEEE International Conference on Bioinformatics and Biomedicine, BIBM
ISSN (Print): 2156-1125
ISSN (Electronic): 2156-1133

Conference

Conference: 2023 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2023
Country/Territory: Turkey
City: Istanbul
Period: 5/12/23 to 8/12/23

Scopus Subject Areas

  • Artificial Intelligence
  • Computer Science Applications
  • Computer Vision and Pattern Recognition
  • Automotive Engineering
  • Modelling and Simulation
  • Health Informatics

User-Defined Keywords

  • density measure
  • distance metric
  • heterogeneous features
  • online feature selection
  • supervised learning
