EHR2Path: Scalable Modeling of Longitudinal Patient Pathways from Multimodal Electronic Health Records
Chantal Pellegrini, Ege Özsoy, David Bani-Harouni, Matthias Keicher, Nassir Navab
Abstract
Forecasting how a patient's condition is likely to evolve, including possible deterioration, recovery, treatment needs, and care transitions, could support more proactive and personalized care, but requires modeling heterogeneous and longitudinal electronic health record (EHR) data. Yet, existing approaches typically focus on isolated prediction tasks, narrow feature spaces, or short context windows, limiting their ability to model full patient pathways. To address this gap, we introduce EHR2Path, a multimodal framework for forecasting and simulating full in-hospital patient pathways from routine EHRs. EHR2Path converts diverse clinical inputs into a unified temporal representation, enabling modeling of a substantially broader set of patient information, including radiology reports, physician notes, vital signs, medication and laboratory patterns, and dense bedside charting. To support long clinical histories and broad feature spaces, we introduce a Masked Summarization Bottleneck that compresses long-term history into compact, task-optimized summary tokens while preserving recent context, improving both performance and token efficiency. In retrospective experiments on MIMIC-IV, EHR2Path enables next-step pathway forecasting and iterative simulation of complete in-hospital trajectories, while outperforming strong baselines on directly comparable tasks. These results establish a foundation for scalable pathway-level modeling from routine EHRs supporting anticipatory clinical decision-making. Our code is available at https://github.com/ChantalMP/EHR2Path.
Full analysis loading… Code implementations, benchmark data, and reproduction guides are being assembled. Please check back shortly.
Need human evaluators for your AI research? Scale annotation with expert AI Trainers.