MEDFuse

Multimodal EHR Data Fusion with Masked Lab-Test Modeling and Large Language Models

🏆 CIKM'24 | ACM International Conference on Information and Knowledge Management
1National Yang Ming Chiao Tung University, 2University of Michigan,
3Shanghai University of Finance and Economics, 4Stevens Institute of Technology,
5Massachusetts Institute of Technology, 6Far Eastern Memorial Hospital
*Equal Contribution

Abstract

Electronic Health Records (EHRs) are multimodal by nature, combining structured tabular features such as lab tests with unstructured clinical notes. In real-world clinical practice, doctors draw on these complementary data sources to form a clearer picture of a patient's health and to support decision-making. Most EHR predictive models, however, do not reflect this practice: they either focus on a single modality or overlook the interactions and redundancy across modalities.


In this work, we propose MEDFuse, a Multimodal EHR Data Fusion framework that incorporates masked lab-test modeling and large language models (LLMs) to effectively integrate structured and unstructured medical data. MEDFuse leverages multimodal embeddings extracted from two sources: LLMs fine-tuned on free-text clinical notes and masked tabular transformers trained on structured lab-test results.


We design a disentangled transformer module, optimized with a mutual information loss, to decouple modality-specific from modality-shared information and to extract useful joint representations from the noise and redundancy present in clinical notes. Through comprehensive validation on the public MIMIC-III dataset and the in-house FEMH dataset, MEDFuse demonstrates strong potential for advancing clinical prediction, achieving an F1 score above 90% on a 10-disease multi-label classification task.

Methodology

A comprehensive framework that effectively combines structured lab tests and unstructured clinical notes

🧬 Multimodal Embedding Extraction

Fine-tuned LLMs process the unstructured clinical notes, while masked lab-test modeling handles the structured tabular data; together they produce rich semantic representations for each modality.
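
As a rough illustration, here is a minimal sketch of the two-branch embedding extraction. It assumes a Hugging Face clinical language model (Bio_ClinicalBERT, used here only as a placeholder for the fine-tuned LLM) and a generic transformer encoder standing in for the masked lab-test model; all names, dimensions, and inputs are illustrative, not the paper's exact configuration.

# Minimal sketch: extract one embedding per modality from two encoders.
# Bio_ClinicalBERT and the 92-test/128-dim setup are illustrative assumptions.
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModel

text_encoder_name = "emilyalsentzer/Bio_ClinicalBERT"  # placeholder clinical LM
tokenizer = AutoTokenizer.from_pretrained(text_encoder_name)
text_encoder = AutoModel.from_pretrained(text_encoder_name)

class LabEncoder(nn.Module):
    """Stand-in for the masked lab-test transformer encoder."""
    def __init__(self, n_tests: int = 92, d_model: int = 128):
        super().__init__()
        self.value_proj = nn.Linear(1, d_model)          # embed each scalar lab value
        self.test_embed = nn.Embedding(n_tests, d_model) # one embedding per lab test
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, values: torch.Tensor) -> torch.Tensor:
        # values: (batch, n_tests) -> pooled (batch, d_model) lab embedding
        ids = torch.arange(values.size(1), device=values.device)
        h = self.value_proj(values.unsqueeze(-1)) + self.test_embed(ids)
        return self.encoder(h).mean(dim=1)

notes = ["Patient admitted with chest pain and elevated troponin."]
inputs = tokenizer(notes, return_tensors="pt", truncation=True)
with torch.no_grad():
    text_emb = text_encoder(**inputs).last_hidden_state[:, 0]  # [CLS] embedding

lab_emb = LabEncoder()(torch.randn(1, 92))  # random lab values as a stand-in
print(text_emb.shape, lab_emb.shape)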

🔬 Masked Lab-Test Modeling (MLTM)

Extends the Masked Autoencoder (MAE) framework to reconstruct masked lab-test values, using an asymmetric encoder-decoder architecture to extract meaningful representations from structured data.
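
Below is a minimal MAE-style sketch of this idea: the encoder runs only on the visible (unmasked) lab tokens, and a shallower decoder reconstructs the masked values from learnable mask tokens plus test-identity embeddings. The mask ratio, depths, and dimensions are illustrative assumptions, not the paper's reported settings.

# Minimal sketch of masked lab-test modeling with an asymmetric
# encoder-decoder; all hyperparameters here are illustrative.
import torch
import torch.nn as nn

class MaskedLabAutoencoder(nn.Module):
    def __init__(self, n_tests=92, d_model=128, mask_ratio=0.5):
        super().__init__()
        self.mask_ratio = mask_ratio
        self.value_proj = nn.Linear(1, d_model)
        self.test_embed = nn.Embedding(n_tests, d_model)
        self.mask_token = nn.Parameter(torch.zeros(1, 1, d_model))
        enc_layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        dec_layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=4)  # deeper encoder
        self.decoder = nn.TransformerEncoder(dec_layer, num_layers=1)  # shallow decoder
        self.head = nn.Linear(d_model, 1)  # predict the scalar lab value

    def forward(self, values):
        B, N = values.shape
        ids = torch.arange(N, device=values.device)
        tokens = self.value_proj(values.unsqueeze(-1)) + self.test_embed(ids)

        # Randomly keep a subset of lab tokens; the rest are masked out.
        n_keep = int(N * (1 - self.mask_ratio))
        perm = torch.rand(B, N, device=values.device).argsort(dim=1)
        keep, masked = perm[:, :n_keep], perm[:, n_keep:]

        visible = torch.gather(tokens, 1, keep.unsqueeze(-1).expand(-1, -1, tokens.size(-1)))
        latent = self.encoder(visible)  # encoder sees visible tokens only

        # Decoder sees encoded visible tokens plus mask tokens carrying test identity.
        mask_emb = self.mask_token.expand(B, N - n_keep, -1) + self.test_embed(masked)
        recon = self.head(self.decoder(torch.cat([latent, mask_emb], dim=1)))

        target = torch.gather(values, 1, masked)  # true values at masked positions
        pred = recon[:, n_keep:, 0]               # predictions at the mask slots
        return nn.functional.mse_loss(pred, target)

loss = MaskedLabAutoencoder()(torch.randn(2, 92))
loss.backward()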

🤖 Disentangled Transformer

Optimized with a mutual information loss that separates modality-specific from modality-shared information, extracting useful joint representations while reducing noise and redundancy.
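
Exact mutual information estimation is involved, so the sketch below substitutes a simple cosine-similarity surrogate: it pulls the two modalities' shared projections together, penalizes overlap between each specific projection and its shared counterpart, and fuses all four components with a small transformer. This is a simplified stand-in for the paper's loss, not its actual formulation; every name and dimension is an assumption.

# Minimal sketch of shared/specific disentanglement with a cosine-similarity
# surrogate in place of a true mutual information estimator.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DisentangledFusion(nn.Module):
    def __init__(self, d_model=128):
        super().__init__()
        # Separate projections route each modality into shared vs. specific subspaces.
        self.shared_text = nn.Linear(d_model, d_model)
        self.shared_lab = nn.Linear(d_model, d_model)
        self.spec_text = nn.Linear(d_model, d_model)
        self.spec_lab = nn.Linear(d_model, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.fusion = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, text_emb, lab_emb):
        sh_t, sh_l = self.shared_text(text_emb), self.shared_lab(lab_emb)
        sp_t, sp_l = self.spec_text(text_emb), self.spec_lab(lab_emb)

        # Surrogate disentanglement loss: align the shared pair, and
        # decorrelate each specific vector from its shared counterpart.
        align = 1 - F.cosine_similarity(sh_t, sh_l).mean()
        leak = F.cosine_similarity(sp_t, sh_t).abs().mean() \
             + F.cosine_similarity(sp_l, sh_l).abs().mean()
        disent_loss = align + leak

        # Fuse the four components as a short token sequence.
        tokens = torch.stack([sh_t, sh_l, sp_t, sp_l], dim=1)
        joint = self.fusion(tokens).mean(dim=1)  # pooled joint representation
        return joint, disent_loss

joint, loss = DisentangledFusion()(torch.randn(2, 128), torch.randn(2, 128))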

Performance Results

MEDFuse significantly outperforms baseline approaches across all evaluation metrics

95.35% training accuracy (MIMIC-III dataset)
92.16% F1 macro score (best performance achieved)
93.75% precision (training)
92.17% recall (training)
+1.49% improvement over Medical-Llama3-8B
93.11% accuracy on the FEMH dataset (real-world validation)

Experimental Validation

Comprehensive evaluation on both public and real-world clinical datasets

MIMIC-III Dataset

A publicly available critical-care database used for benchmarking medical AI systems. Evaluation focuses on the 10 most prevalent conditions, including hypertension, cardiac arrhythmias, and diabetes, framed as a multi-label classification task over 10 diseases.

FEMH Dataset

Real-world EHR data from Far Eastern Memorial Hospital (2017-2021), comprising 1.42M clinical notes, 387K lab results, and comprehensive patient records.

Ablation Study

Each component contributes significantly to the overall model performance

-4.81% without MLTM & LABTEXT (precision drop in training)
-29.76% without MLTM & TEXT (the largest drop)
-17.14% without TEXT only (shows the importance of clinical notes)
-0.46% without the disentangled transformer (fusion module contribution)

Citation

@inproceedings{phan2024medfuse,
  title={MEDFuse: Multimodal EHR Data Fusion with Masked Lab-Test Modeling and Large Language Models},
  author={Phan, Thao Minh Nguyen and Dao, Cong-Tinh and Wu, Chenwei and Wang, Jian-Zhe and Liu, Shun and Ding, Jun-En and Restrepo, David and Liu, Feng and Hung, Fang-Ming and Peng, Wen-Chih},
  booktitle={Proceedings of the 33rd ACM International Conference on Information and Knowledge Management (CIKM '24)},
  year={2024},
  publisher={ACM},
  doi={10.1145/3627673.3679962}
}