Participants
Master Project Students: Lucas Reisener, Philipp Schimmelfennig
Supervisors: Stefan Kalabakov
Project description
Electronic Health Records (EHR) offer unprecedented opportunities for applying machine learning (ML) to improve patient care and clinical decision-making. However, leveraging EHR data is complicated by privacy regulations (such as GDPR and HIPAA) and institutional policies that restrict or prohibit sharing patient data, as well as by significant heterogeneity in data formats and standards across data sources. Traditional centralized ML approaches, which require pooling data in one place, can therefore be impractical.
This motivates the adoption of federated learning (FL) as a privacy-preserving alternative. FL enables collaborative model training across institutions without sharing raw data, but it introduces new challenges, including communication overhead and sensitivity to data that is not independent and identically distributed (non-IID) across clients.
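To make the idea of training without sharing raw data concrete, here is a minimal sketch of federated averaging (FedAvg), one common aggregation rule. It is an illustration only and not necessarily the strategy used in this project; the function name `fedavg` and the two example hospitals are assumptions made for this sketch.

```python
def fedavg(client_weights, client_sizes):
    """Average parameter vectors, weighting each client by its local sample count."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


# Two hypothetical hospitals send only model parameters, never patient records.
global_weights = fedavg([[1.0, 2.0], [3.0, 4.0]], [100, 300])
```

Because the second client holds three times as many samples, its parameters pull the average toward its own values. This also hints at why non-IID data matters: the global model is dominated by whichever local distributions carry the most weight.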
The absence of widely adopted benchmarks and harmonization tools complicates the evaluation of federated learning strategies on EHR data. Differences in cohort definitions, preprocessing pipelines, and experimental setups make it difficult to compare methods or reproduce findings across studies. As a result, the benefits of novel FL strategies can become conflated with differences in data preparation and task definitions, which hinders the identification of best practices and slows progress toward clinical translation.
In this project, we worked on FedEHR-Bench, a benchmarking framework designed to address these challenges and enable systematic evaluation of FL strategies on EHR data. Our pipeline supports six widely used ICU datasets, provides tools for harmonization and preprocessing, enables declarative configuration of data distributions across FL clients, and includes tools for analyzing these distributions. This allows researchers to systematically study the impact of data heterogeneity on FL performance and to reproduce experimental scenarios.
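As an illustration of what a declarative configuration of data distributions across FL clients could look like, the sketch below splits sample indices across clients with Dirichlet-based label skew, a commonly used way to simulate non-IID data. The configuration keys, `PARTITION_CONFIG`, and `partition_by_label` are hypothetical names chosen for this example and do not reflect FedEHR-Bench's actual schema or API.

```python
import random
from collections import defaultdict

# Hypothetical declarative description of a client split (not FedEHR-Bench's schema).
PARTITION_CONFIG = {
    "num_clients": 4,
    "strategy": "dirichlet_label_skew",
    "alpha": 0.5,  # smaller alpha -> more heterogeneous label distributions
    "seed": 42,    # fixed seed so the same scenario can be reproduced
}


def dirichlet_sample(alpha, k, rng):
    """Draw one sample from a symmetric Dirichlet(alpha) via normalized Gamma draws."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
    total = sum(draws)
    return [d / total for d in draws]


def partition_by_label(labels, config):
    """Assign every sample index to exactly one client, skewing labels per client."""
    rng = random.Random(config["seed"])
    k = config["num_clients"]

    # Group sample indices by their label.
    by_label = defaultdict(list)
    for idx, y in enumerate(labels):
        by_label[y].append(idx)

    clients = [[] for _ in range(k)]
    for y in sorted(by_label):
        indices = by_label[y]
        rng.shuffle(indices)
        props = dirichlet_sample(config["alpha"], k, rng)
        # Turn proportions into cut points; the last client takes the remainder.
        start, cum = 0, 0.0
        for c in range(k):
            cum += props[c]
            if c == k - 1:
                end = len(indices)
            else:
                end = min(len(indices), round(cum * len(indices)))
            end = max(end, start)
            clients[c].extend(indices[start:end])
            start = end
    return clients
```

A call such as `partition_by_label([0] * 80 + [1] * 20, PARTITION_CONFIG)` returns one list of indices per client. Keeping the split parameters in a single configuration object, including the seed, is what makes a heterogeneity scenario repeatable across runs.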
FedEHR-Bench integrates with Flower for FL simulation, with standard ML libraries, and with modern experiment tracking tools, with the aim of keeping experiments reproducible and the framework extensible and broadly applicable.