TY - GEN
T1 - Public health surveillance with incomplete data - Spatio-temporal imputation for inferring infectious disease dynamics
AU - Tan, Qi
AU - Liu, Jiming
AU - Shi, Benyun
AU - Liu, Yang
AU - Zhou, Xiao Nong
N1 - Funding Information:
This work was supported in part by grants from the Research Grants Council of Hong Kong SAR under Project RGC/HKBU12202415 and in part by the National Natural Science Foundation of China under Grant 81402760.
PY - 2018/7/24
Y1 - 2018/7/24
N2 - For thousands of years, infectious diseases have been a major threat to mankind. To detect and prevent epidemics early and effectively, public health surveillance is essential to disease control efforts. Planning public health surveillance requires public health data for a given area, from which spatial and temporal disease transmission patterns can be discovered and used to place surveillance sentinels. However, missing data are often unavoidable in epidemic scenarios. Moreover, different kinds of missingness, such as spatial, temporal, and random missingness, make the modeling challenging. Existing methods for missing data completion model spatio-temporal correlations only on the target variable and ignore the underlying risk factors, which have been shown to play an important role in making inferences on the target variable (i.e., the number of infected cases). Moreover, the strengths of spatio-temporal correlations, which existing methods assume to be fixed, can change dynamically with the underlying risk factors. To take the underlying risk factors into account when inferring disease dynamics from incomplete data, we propose a novel method called spatio-temporal imputation via kernel-based learning (STI-KL). Specifically, we infer the missing data by determining the location-specific correlations of dynamically changing disease-related risk factors. The spatio-temporal correlations of the target variable are inferred from various disease-related risk factors and geographic distances. To integrate the spatio-temporal learning processes, we develop an alternating algorithm to update the model parameters. Extensive experiments on real-world malaria surveillance data and on a systematically designed synthetic dataset validate the effectiveness of the proposed method.
KW - Incomplete data
KW - Infectious disease spread
KW - Spatio-temporal process
UR - http://www.scopus.com/inward/record.url?scp=85051144543&partnerID=8YFLogxK
U2 - 10.1109/ICHI.2018.00036
DO - 10.1109/ICHI.2018.00036
M3 - Conference proceeding
AN - SCOPUS:85051144543
T3 - Proceedings - 2018 IEEE International Conference on Healthcare Informatics, ICHI 2018
SP - 255
EP - 264
BT - Proceedings - 2018 IEEE International Conference on Healthcare Informatics, ICHI 2018
PB - IEEE
T2 - 6th IEEE International Conference on Healthcare Informatics, ICHI 2018
Y2 - 4 June 2018 through 7 June 2018
ER -