EHR-KnowGen: Knowledge-enhanced multimodal learning for disease diagnosis generation

Shuai Niu, Jing Ma, Liang Bai, Zhihua Wang, Li Guo, Xian Yang*

*Corresponding author for this work

Research output: Contribution to journal › Journal article › peer-review

3 Citations (Scopus)


Electronic health records (EHRs) contain diverse patient information, including medical notes, clinical events, and laboratory test results. Integrating this multimodal data with deep learning models can improve disease diagnosis. However, effectively combining different modalities for diagnosis remains challenging. Previous approaches, such as attention mechanisms and contrastive learning, have attempted to address this challenge but do not fully integrate the modalities into a unified feature space. This paper presents EHR-KnowGen, a multimodal learning model enhanced with external domain knowledge, for improved disease diagnosis generation from diverse patient information in EHRs. Unlike previous approaches, our model integrates different modalities into a unified feature space through soft prompt learning and leverages large language models (LLMs) to generate disease diagnoses. By incorporating external domain knowledge at different levels of granularity, we enhance the extraction and fusion of multimodal information, resulting in more accurate diagnosis generation. Experimental results on real-world EHR datasets demonstrate the superiority of our generative model over comparative methods, and the model provides explainable evidence that aids understanding of the diagnosis results.
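
The idea of mapping several modalities into one feature space via soft prompts can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no architectural details, so the modality names (`notes`, `events`, `labs`), the feature widths, the embedding width `EMBED_DIM`, and the random projection weights are all hypothetical stand-ins for learned components. The sketch only shows the general pattern of projecting each modality's features to a common width and prepending them, as soft prompt vectors, to the token embeddings that a language model would consume.

```python
import random


def linear_projection(in_dim, out_dim, rng):
    """Return a random in_dim x out_dim weight matrix (stand-in for a learned layer)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(out_dim)] for _ in range(in_dim)]


def project(vector, weights):
    """Map one modality feature vector into the shared embedding space."""
    out_dim = len(weights[0])
    return [
        sum(vector[i] * weights[i][j] for i in range(len(vector)))
        for j in range(out_dim)
    ]


def build_prompted_input(modality_features, projections, token_embeddings):
    """Prepend one projected soft prompt vector per modality to the token embeddings."""
    soft_prompts = [
        project(modality_features[name], projections[name])
        for name in modality_features
    ]
    return soft_prompts + token_embeddings


rng = random.Random(0)
EMBED_DIM = 8  # hypothetical language-model embedding width

# Hypothetical per-modality encoder outputs; each modality has a different width.
modality_features = {
    "notes": [rng.random() for _ in range(16)],
    "events": [rng.random() for _ in range(6)],
    "labs": [rng.random() for _ in range(4)],
}
projections = {
    name: linear_projection(len(vec), EMBED_DIM, rng)
    for name, vec in modality_features.items()
}

# Five placeholder token embeddings standing in for an embedded text prompt.
token_embeddings = [[rng.random() for _ in range(EMBED_DIM)] for _ in range(5)]

sequence = build_prompted_input(modality_features, projections, token_embeddings)
print(len(sequence), len(sequence[0]))  # 3 soft prompts + 5 tokens, each of width 8
```

In a trained system the projection weights would be learned jointly with the generation objective, and each modality might contribute several prompt vectors rather than one; both choices are assumptions here, not details taken from the paper.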

Original language: English
Article number: 102069
Journal: Information Fusion
Early online date: 13 Oct 2023
Publication status: Published - Feb 2024

Scopus Subject Areas

  • Software
  • Signal Processing
  • Information Systems
  • Hardware and Architecture

User-Defined Keywords

  • Disease diagnosis
  • Generative large language model
  • Knowledge enhancement
  • Multimodal electronic health records
  • Multimodal learning

