Abstract
Biomedical big data are usually high dimensional and collected as a continuous influx of new features. Online Feature Selection (OFS) is a promising way to manage and analyze such data, as OFS circumvents the huge computation cost of considering all the features simultaneously, and can also dynamically maintain a distribution-fitting feature subset on the fly. However, almost all OFS solutions rest on the naive premise that all features are of the same type, overlooking the fact that real biomedical data sets usually consist of heterogeneous numerical and categorical features. This paper therefore proposes a new approach to Online Heterogeneous Feature Selection (OHFS), which dynamically maintains a feature subset that maximizes the number of neighborhood sets in which all objects are of the same class. To appropriately partition the objects into neighborhood sets, a density-guided relation is proposed, which adaptively forms non-overlapping neighborhood sets by detecting spatially compact objects. A unified density measure is also presented to avoid information loss in processing heterogeneous features. The proposed approach turns out to be parameter-free, interpretable, and efficient. It is capable of maintaining a concise feature subset while receiving any type of feature. Extensive experimental evaluations demonstrate its superiority.
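The selection criterion described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: `mixed_distance` is a Gower-style stand-in and not the paper's unified density measure, `nearest_neighbor_partition` is a simple parameter-free substitute for the density-guided relation, and `online_select` is a hypothetical greedy loop that keeps an arriving feature only if it raises the count of single-class neighborhood sets.

```python
from collections import defaultdict


def mixed_distance(a, b, kinds, ranges):
    """Gower-style stand-in: range-scaled gap for numerical features,
    0/1 mismatch for categorical features, averaged over all features."""
    total = 0.0
    for x, y, kind, rng in zip(a, b, kinds, ranges):
        if kind == "num":
            total += abs(x - y) / rng if rng else 0.0
        else:
            total += 0.0 if x == y else 1.0
    return total / len(kinds)


def nearest_neighbor_partition(rows, kinds):
    """Stand-in partition (NOT the paper's density-guided relation):
    link each object to its nearest neighbor and return the connected
    components as non-overlapping neighborhood sets. Needs >= 2 objects."""
    n = len(rows)
    ranges = [
        (max(r[j] for r in rows) - min(r[j] for r in rows))
        if kinds[j] == "num" else None
        for j in range(len(kinds))
    ]
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        nearest = min(
            (j for j in range(n) if j != i),
            key=lambda j: mixed_distance(rows[i], rows[j], kinds, ranges),
        )
        parent[find(i)] = find(nearest)
    return [find(i) for i in range(n)]


def count_pure_sets(set_ids, labels):
    """Count neighborhood sets whose objects all share one class label."""
    classes = defaultdict(set)
    for sid, lab in zip(set_ids, labels):
        classes[sid].add(lab)
    return sum(1 for c in classes.values() if len(c) == 1)


def online_select(feature_stream, labels):
    """Hypothetical greedy loop: feature_stream yields (name, kind, column)
    with kind in {"num", "cat"}; a feature is kept only if it strictly
    increases the number of pure neighborhood sets."""
    selected, best = [], 0
    for name, kind, column in feature_stream:
        candidate = selected + [(name, kind, column)]
        rows = list(zip(*(col for _, _, col in candidate)))
        kinds = [k for _, k, _ in candidate]
        score = count_pure_sets(nearest_neighbor_partition(rows, kinds), labels)
        if score > best:
            selected, best = candidate, score
    return [name for name, _, _ in selected], best
```

With four objects labeled `[0, 0, 1, 1]`, a numerical feature that separates the classes is kept, while a later categorical feature that mixes them is rejected, so the subset stays concise. The strict inequality is a design choice of this sketch: a feature that merely ties the current score is treated as redundant.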
Original language | English |
---|---|
Title of host publication | Proceedings - 2023 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2023 |
Editors | Xingpeng Jiang, Haiying Wang, Reda Alhajj, Xiaohua Hu, Felix Engel, Mufti Mahmud, Nadia Pisanti, Xuefeng Cui, Hong Song |
Publisher | IEEE |
Pages | 771-778 |
Number of pages | 8 |
ISBN (Electronic) | 9798350337488 |
ISBN (Print) | 9798350337495 |
Publication status | Published - 5 Dec 2023 |
Event | 2023 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2023 - Istanbul, Turkey. Duration: 5 Dec 2023 → 8 Dec 2023. https://ieeexplore.ieee.org/xpl/conhome/10385250/proceeding |
Publication series
Name | Proceedings - IEEE International Conference on Bioinformatics and Biomedicine, BIBM |
---|---|
ISSN (Print) | 2156-1125 |
ISSN (Electronic) | 2156-1133 |
Conference
Conference | 2023 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2023 |
---|---|
Country/Territory | Turkey |
City | Istanbul |
Period | 5/12/23 → 8/12/23 |
Scopus Subject Areas
- Artificial Intelligence
- Computer Science Applications
- Computer Vision and Pattern Recognition
- Automotive Engineering
- Modelling and Simulation
- Health Informatics
User-Defined Keywords
- density measure
- distance metric
- heterogeneous features
- online feature selection
- supervised learning