Learning Inter-Modal Correspondence and Phenotypes from Multi-Modal Electronic Health Records

Kejing Yin*, Kwok Wai Cheung, Benjamin C.M. Fung, Jonathan Poon

*Corresponding author for this work

Research output: Contribution to journal › Journal article › peer-review

4 Citations (Scopus)


Non-negative tensor factorization has been shown to be a practical solution for automatically discovering phenotypes from electronic health records (EHR) with minimal human supervision. Such methods generally require an input tensor describing the inter-modal interactions to be pre-established; however, the correspondence between different modalities (e.g., between medications and diagnoses) is often missing in practice. Although heuristic methods can be applied to estimate it, they inevitably introduce errors and lead to sub-optimal phenotype quality. This is particularly important for patients with complex health conditions (e.g., in critical care), as multiple diagnoses and medications are simultaneously present in the records. To alleviate this problem and discover phenotypes from EHR with unobserved inter-modal correspondence, we propose collective hidden interaction tensor factorization (cHITF), which infers the correspondence between multiple modalities jointly with phenotype discovery. We assume that the observed matrix for each modality is a marginalization of the unobserved inter-modal correspondence, which is reconstructed by maximizing the likelihood of the observed matrices. Extensive experiments on the real-world MIMIC-III dataset demonstrate that cHITF effectively infers clinically meaningful inter-modal correspondence, discovers phenotypes that are more clinically relevant and diverse, and achieves better predictive performance than a number of state-of-the-art computational phenotyping models.
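The marginalization assumption in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes two modalities only (diagnoses and medications), a rank-R CP form for the hidden patient × diagnosis × medication tensor, a Poisson likelihood, and standard multiplicative updates; the abstract specifies none of these. All function and variable names (`marginals`, `neg_log_lik`, `fit`, `U`, `V`, `W`) are hypothetical.

```python
import numpy as np

def marginals(U, V, W):
    """Marginals of the hidden tensor X[p,d,m] = sum_r U[p,r] V[d,r] W[m,r]."""
    D_hat = (U * W.sum(axis=0)) @ V.T   # X summed over medications
    M_hat = (U * V.sum(axis=0)) @ W.T   # X summed over diagnoses
    return D_hat, M_hat

def neg_log_lik(D, M, U, V, W, eps=1e-10):
    """Poisson negative log-likelihood (assumed), up to an additive constant."""
    D_hat, M_hat = marginals(U, V, W)
    return float(np.sum(D_hat - D * np.log(D_hat + eps))
                 + np.sum(M_hat - M * np.log(M_hat + eps)))

def ratios(D, M, U, V, W, eps=1e-10):
    """Element-wise observed / reconstructed ratios for both matrices."""
    D_hat, M_hat = marginals(U, V, W)
    return D / (D_hat + eps), M / (M_hat + eps)

def fit(D, M, rank, n_iter=200, seed=0, eps=1e-10):
    """Fit non-negative factors to both observed matrices jointly.

    D: patients x diagnoses counts, M: patients x medications counts.
    The hidden interaction tensor itself is never observed.
    """
    rng = np.random.default_rng(seed)
    U = rng.random((D.shape[0], rank)) + 0.1   # patient factors
    V = rng.random((D.shape[1], rank)) + 0.1   # diagnosis factors
    W = rng.random((M.shape[1], rank)) + 0.1   # medication factors
    history = [neg_log_lik(D, M, U, V, W)]
    for _ in range(n_iter):
        # Each update multiplies a factor by (negative gradient part) /
        # (positive gradient part) of the joint objective.
        RD, RM = ratios(D, M, U, V, W)
        v, w = V.sum(axis=0), W.sum(axis=0)
        U *= (RD @ (V * w) + RM @ (W * v)) / (2 * v * w + eps)

        RD, RM = ratios(D, M, U, V, W)
        u, w = U.sum(axis=0), W.sum(axis=0)
        V *= ((RD.T @ U) * w + (U * (RM @ W)).sum(axis=0)) / (2 * u * w + eps)

        RD, RM = ratios(D, M, U, V, W)
        u, v = U.sum(axis=0), V.sum(axis=0)
        W *= ((RM.T @ U) * v + (U * (RD @ V)).sum(axis=0)) / (2 * u * v + eps)

        history.append(neg_log_lik(D, M, U, V, W))
    return U, V, W, history

# Synthetic data (not MIMIC-III): draw a hidden tensor, keep only its marginals.
rng = np.random.default_rng(42)
U0, V0, W0 = rng.random((30, 3)), rng.random((8, 3)), rng.random((6, 3))
X_true = rng.poisson(2.0 * np.einsum("pr,dr,mr->pdm", U0, V0, W0))
D, M = X_true.sum(axis=2), X_true.sum(axis=1)   # the only "observed" inputs

U, V, W, history = fit(D, M, rank=3)
# Inferred correspondence between diagnoses and medications for every patient.
X_hat = np.einsum("pr,dr,mr->pdm", U, V, W)
```

In this sketch, summing `X_hat` over the medication axis gives the model's diagnosis matrix, and summing over the diagnosis axis gives its medication matrix, which is the marginalization relationship described above. The columns of `V` and `W` for a given component can be read together as one candidate phenotype. The hidden tensor is constrained only by the low-rank structure and the two marginals, so the recovered correspondence should be treated as an estimate under these assumptions.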

Original language: English
Pages (from-to): 4328-4341
Number of pages: 14
Journal: IEEE Transactions on Knowledge and Data Engineering
Issue number: 9
Early online date: 16 Nov 2020
Publication status: Published - 1 Sept 2022

Scopus Subject Areas

  • Information Systems
  • Computer Science Applications
  • Computational Theory and Mathematics

User-Defined Keywords

  • Electronic health records
  • Multi-modal data mining
  • Computational phenotyping
  • Tensor factorization

