Abstract
Clinical time series imputation is recognized as an essential task in clinical data analytics. Most existing models either rely on strong assumptions about the underlying data-generation process or preserve only local properties without effectively considering global dependencies. To advance the state of the art in clinical time series imputation, we participated in the 2019 ICHI Data Analytics Challenge on Missing Data Imputation (DACMI). In this paper, we present our proposed model, Context-Aware Time Series Imputation (CATSI), a novel framework based on a bidirectional LSTM in which patients’ health states are explicitly captured by learning a “global context vector” from the entire clinical time series. The imputations are then produced with reference to the global context vector. We also incorporate a cross-feature imputation component to exploit the complex correlations among features. Empirical evaluations demonstrate that CATSI obtains a normalized root mean square deviation (nRMSD) of 0.1998, which is 10.6% better than that of state-of-the-art models. Further experiments on datasets with consecutive missing values also illustrate the effectiveness of incorporating the global context in generating accurate imputations.
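As an illustration of the evaluation metric named in the abstract, the following is a minimal sketch of a normalized root mean square deviation for a single variable. The abstract does not spell out the normalization, so this sketch assumes each error is scaled by the per-patient range (maximum minus minimum) of the true values, and that only entries flagged in an evaluation mask are scored. The function name `nrmsd` and its arguments are hypothetical and are not taken from the paper.

```python
import math


def nrmsd(truth, imputed, mask):
    """Normalized RMSD over the masked entries of one variable.

    truth, imputed: list of per-patient lists of floats (same shapes).
    mask: same shape; True where a value was hidden and is scored.
    Assumption: errors are scaled by each patient's true-value range.
    """
    total = 0.0
    count = 0
    for y_true, y_hat, flags in zip(truth, imputed, mask):
        span = max(y_true) - min(y_true)
        if span == 0:
            # Normalization is undefined for a constant series; skip it.
            continue
        for t, h, flag in zip(y_true, y_hat, flags):
            if flag:
                total += ((t - h) / span) ** 2
                count += 1
    if count == 0:
        raise ValueError("no scored entries")
    return math.sqrt(total / count)


# One patient, one hidden value: true 5.0, imputed 7.0, range 10.0.
score = nrmsd([[0.0, 10.0, 5.0]], [[0.0, 10.0, 7.0]], [[False, False, True]])
print(score)  # approximately 0.2
```

Under this assumed definition, lower values indicate imputations closer to the held-out measurements, which is the sense in which the reported 0.1998 is compared against other models.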
Original language | English |
---|---|
Pages (from-to) | 411-426 |
Number of pages | 16 |
Journal | Journal of Healthcare Informatics Research |
Volume | 4 |
Issue number | 4 |
Early online date | 18 Oct 2020 |
Publication status | Published - Dec 2020 |
Scopus Subject Areas
- Health Informatics
- Computer Science Applications
- Information Systems
- Artificial Intelligence
User-Defined Keywords
- Clinical time series
- Electronic health records
- Missing data imputation