TY - JOUR
T1 - Adaptive Integration of Categorical and Multi-relational Ontologies with EHR Data for Medical Concept Embedding
AU - Cheong, Chin Wang
AU - Yin, Kejing
AU - Cheung, William K.
AU - Fung, Benjamin C.M.
AU - Poon, Jonathan
N1 - This research is partially supported by General Research Fund 12202117 and 12201219 from the Research Grants Council of Hong Kong.
Publisher Copyright:
© 2023 Copyright held by the owner/author(s). Publication rights licensed to ACM.
PY - 2023/11/14
Y1 - 2023/11/14
N2 - Representation learning has been applied to Electronic Health Records (EHR) for medical concept embedding and the downstream predictive analytics tasks with promising results. Medical ontologies can also be integrated to guide the learning so the embedding space can better align with existing medical knowledge. Yet, properly carrying out the integration is non-trivial. Medical concepts that are similar according to a medical ontology may not be necessarily close in the embedding space learned from the EHR data, as medical ontologies organize medical concepts for their own specific objectives. Any integration methodology without considering the underlying inconsistency will result in sub-optimal medical concept embedding and, in turn, degrade the performance of the downstream tasks. In this article, we propose a novel representation learning framework called ADORE (ADaptive Ontological REpresentations) that allows the medical ontologies to adapt their structures for more robust integrating with the EHR data. ADORE first learns multiple embeddings for each category in the ontology via an attention mechanism. At the same time, it supports an adaptive integration of categorical and multi-relational ontologies in the embedding space using a category-aware graph attention network. We evaluate the performance of ADORE on a number of predictive analytics tasks using two EHR datasets. Our experimental results show that the medical concept embeddings obtained by ADORE can outperform the state-of-the-art methods for all the tasks. More importantly, it can result in clinically meaningful sub-categorization of the existing ontological categories and yield attention values that can further enhance the model interpretability.
AB - Representation learning has been applied to Electronic Health Records (EHR) for medical concept embedding and the downstream predictive analytics tasks with promising results. Medical ontologies can also be integrated to guide the learning so the embedding space can better align with existing medical knowledge. Yet, properly carrying out the integration is non-trivial. Medical concepts that are similar according to a medical ontology may not be necessarily close in the embedding space learned from the EHR data, as medical ontologies organize medical concepts for their own specific objectives. Any integration methodology without considering the underlying inconsistency will result in sub-optimal medical concept embedding and, in turn, degrade the performance of the downstream tasks. In this article, we propose a novel representation learning framework called ADORE (ADaptive Ontological REpresentations) that allows the medical ontologies to adapt their structures for more robust integrating with the EHR data. ADORE first learns multiple embeddings for each category in the ontology via an attention mechanism. At the same time, it supports an adaptive integration of categorical and multi-relational ontologies in the embedding space using a category-aware graph attention network. We evaluate the performance of ADORE on a number of predictive analytics tasks using two EHR datasets. Our experimental results show that the medical concept embeddings obtained by ADORE can outperform the state-of-the-art methods for all the tasks. More importantly, it can result in clinically meaningful sub-categorization of the existing ontological categories and yield attention values that can further enhance the model interpretability.
KW - Electronic health record
KW - data mining with ontologies
KW - predictive data analytics
KW - representation learning
UR - http://www.scopus.com/inward/record.url?scp=85180192344&partnerID=8YFLogxK
U2 - 10.1145/3625224
DO - 10.1145/3625224
M3 - Journal article
AN - SCOPUS:85180192344
SN - 2157-6904
VL - 14
JO - ACM Transactions on Intelligent Systems and Technology
JF - ACM Transactions on Intelligent Systems and Technology
IS - 6
M1 - 111
ER -