Abstract
Medical time series of laboratory tests have been collected in electronic health records (EHRs) in many countries. Machine-learning algorithms have been proposed to analyze patients' conditions using these records. However, different datasets may record medical time series using different laboratory parameters. As a result, a pretrained model cannot be applied to a test dataset whose time series contain different laboratory parameters. This article addresses the problem with an unsupervised time-series adaptation method that generates time series across laboratory parameters. Specifically, a medical time-series generation network with similarity distillation is developed to reduce the domain gap caused by the difference in laboratory parameters. The relations between different laboratory parameters are analyzed, and the similarity information is distilled to guide the generation of target-domain-specific laboratory parameters. To further improve performance in cross-domain medical applications, a missingness-aware feature extraction network is proposed, in which the missingness patterns reflect health conditions and thus serve as auxiliary features for medical analysis. In addition, domain-adversarial networks are introduced at both the feature level and the time-series level to enhance adaptation across domains. Experimental results show that the proposed method achieves good performance on both private and publicly available medical datasets. Ablation studies and distribution visualization are provided to further analyze the properties of the proposed method.
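The abstract notes that missingness patterns can serve as auxiliary features. The following is a minimal sketch of that general idea, not the network proposed in the article. It derives an observation mask and a steps-since-last-observation counter from a series with missing entries. The function name `missingness_features`, the use of `None` for a missing test result, forward filling, and the 0.0 placeholder before the first observation are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

# One row per time step, one column per laboratory parameter.
# None marks a test that was not recorded at that step (assumed encoding).
Series = List[List[Optional[float]]]


def missingness_features(
    series: Series,
) -> Tuple[List[List[float]], List[List[int]], List[List[int]]]:
    """Return (forward-filled values, observation mask, steps since last observation)."""
    if not series:
        return [], [], []
    n_params = len(series[0])
    last_value = [0.0] * n_params  # assumption: 0.0 until a parameter is first observed
    gap = [0] * n_params           # consecutive missing steps per parameter
    filled, mask, delta = [], [], []
    for row in series:
        f_row, m_row, d_row = [], [], []
        for j, value in enumerate(row):
            if value is None:
                gap[j] += 1
                m_row.append(0)
            else:
                gap[j] = 0
                last_value[j] = value
                m_row.append(1)
            f_row.append(last_value[j])
            d_row.append(gap[j])
        filled.append(f_row)
        mask.append(m_row)
        delta.append(d_row)
    return filled, mask, delta


if __name__ == "__main__":
    example = [[1.0, None], [None, 2.0], [None, None]]
    values, observed, since_last = missingness_features(example)
    print(values)
    print(observed)
    print(since_last)
```

In a sketch like this, the mask and gap counter would be concatenated with the filled values before being passed to a downstream feature extractor, so that the model can see which values were measured and which were carried forward.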
Original language | English |
---|---|
Pages (from-to) | 3394-3407 |
Number of pages | 14 |
Journal | IEEE Transactions on Cybernetics |
Volume | 52 |
Issue number | 5 |
Early online date | 14 Aug 2020 |
Publication status | Published - May 2022 |
Scopus Subject Areas
- Software
- Information Systems
- Human-Computer Interaction
- Electrical and Electronic Engineering
- Control and Systems Engineering
- Computer Science Applications
User-Defined Keywords
- Medical data
- Time series
- Unsupervised domain adaptation (UDA)