TY - JOUR
T1 - Heterogeneous neural metric learning for spatio-temporal modeling of infectious diseases with incomplete data
AU - Tan, Qi
AU - Liu, Yang
AU - Liu, Jiming
AU - Shi, Benyun
AU - Xia, Shang
AU - Zhou, Xiao Nong
N1 - Funding Information:
The authors acknowledge funding support from the Hong Kong Research Grants Council (RGC/HKBU12201318, RGC/HKBU12201619, RGC/HKBU12202220) for the research presented in this article.
PY - 2021/10/11
Y1 - 2021/10/11
N2 - Infectious disease data, which record the number of infection cases at different locations and times, are among the most typical categories of spatio-temporal data and play an important role in infectious disease control and prevention. However, because of insufficient resources and manpower, observations and records of infection cases are inevitably missing at some locations and times, which hinders accurate risk assessment and timely disease control. Imputing the missing data is challenging because the diffusion of an infectious disease can be caused and affected by many risk factors. To address these challenges, a novel machine learning method, Heterogeneous Neural Metric Learning (HNML), is developed to restore the integrity of case reporting data using both the incomplete reported cases and the underlying disease-related risk factors from heterogeneous data sources. We empirically validate the method on a representative infectious disease, malaria, under three common real-life missing-data patterns with different missing rates. By incorporating the disease-related risk factors as external resources through HNML, we demonstrate a significant accuracy improvement over baseline and state-of-the-art inference methods for predicting unobserved malaria cases from the incomplete reporting data. The results suggest that disease-related risk factors provide valuable information about the transmission patterns of infectious diseases and should be taken into account when implementing surveillance.
KW - heterogeneous data sources
KW - heterogeneous neural metric learning (HNML)
KW - incomplete-data
KW - infectious disease
KW - kernel method
KW - metric learning
KW - spatio-temporal modeling
UR - http://www.scopus.com/inward/record.url?scp=85097753741&partnerID=8YFLogxK
U2 - 10.1016/j.neucom.2019.12.145
DO - 10.1016/j.neucom.2019.12.145
M3 - Journal article
AN - SCOPUS:85097753741
SN - 0925-2312
VL - 458
SP - 701
EP - 713
JO - Neurocomputing
JF - Neurocomputing
ER -