Prof. Dr. h.c. mult. Hasso Plattner

Master project: Global Medical Knowledge

General Information

Teaching Staff: Milena Kraus, Cindy Perscheid, Dr. Matthieu-P. Schapranow, Dr. Matthias Uflacker

Project Kickoff Location: HPI Campus II, August-Bebel-Str. 88, 2nd floor

Contact: Milena Kraus


Selecting the best treatment option for patients suffering from rare diseases, e.g. autoimmune diseases, is still a challenging task for clinicians. Scientific discovery is hampered by the lack of instant access to patient data, e.g. for testing research hypotheses. Current medical knowledge about rare diseases is young and fragmented, and biomarkers for meaningful decision making are not yet verified.

The aim of precision medicine is to incorporate all available data, e.g. from other patients, historic cases or biobanks, to derive a meaningful decision based on statistical analyses. Medical doctors should be able to access all details relevant for the treatment of a current case, e.g. from similar historic and worldwide patient cases, in a way that guarantees the privacy of each individual. Researchers should be able to test their hypotheses on linked data sources to improve scientific findings. For the first time, medical documentation would be used to improve decision making for a current patient and not only for accounting and billing purposes. However, it remains unclear how to identify, link, and explore relevant data from distributed data sources, e.g. across hospitals, countries, and continents, in a timely manner.


Let us assume a cooperation project between two medical centers that requires the exchange of pseudonymized data for a subset of patients suffering from a specific type of cancer in order to find the optimal treatment decision. Today, medical data is stored in the distributed, heterogeneous databases of the individual medical centers. We use the latest in-memory database technology to combine data from these individual systems and thereby enable precision medicine. To this end, we first need to define a hierarchy of distributed database systems and how to connect them. The master database node needs access to specific subsets of the data of the individual partners. It distributes incoming queries to the corresponding databases for further processing and merges the individual results from the sub-databases to form the complete response. In such a setup, subsets of private database content can be shared between project partners, while full control over the complete data set remains local. As a result, all cooperation partners are able to perform analyses on the shared data set only. This approach requires no changes to the partners' individual database landscapes to enable the exchange of data.
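
The query flow described above (a master node distributes a query to the partner databases and merges the partial results) can be illustrated with a minimal sketch. This is not the project's implementation: it assumes in-memory SQLite databases as stand-ins for the partner systems, and the table `patients`, the view `shared_patients`, the `shared` flag, the partner names, and all sample rows are hypothetical and invented purely for illustration.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def make_partner_db(rows):
    """Create a stand-in partner database (hypothetical schema)."""
    # check_same_thread=False lets the master's worker threads use the connection.
    conn = sqlite3.connect(":memory:", check_same_thread=False)
    conn.execute(
        "CREATE TABLE patients (pseudonym TEXT, diagnosis TEXT, therapy TEXT, "
        "outcome TEXT, shared INTEGER)"
    )
    conn.executemany("INSERT INTO patients VALUES (?, ?, ?, ?, ?)", rows)
    # Only rows flagged as shared are visible through this view;
    # everything else stays under the partner's local control.
    conn.execute(
        "CREATE VIEW shared_patients AS "
        "SELECT pseudonym, diagnosis, therapy, outcome "
        "FROM patients WHERE shared = 1"
    )
    conn.commit()
    return conn


def federated_query(partners, sql, params=()):
    """Master node: send the same query to every partner, then merge the results."""
    def run(item):
        name, conn = item
        # Tag each row with the partner it came from.
        return [(name, *row) for row in conn.execute(sql, params).fetchall()]

    with ThreadPoolExecutor() as pool:
        partial_results = list(pool.map(run, partners.items()))
    # Merge step: concatenate the partial results into one response.
    return [row for part in partial_results for row in part]


# Two hypothetical medical centers with invented sample data.
partners = {
    "center_a": make_partner_db([
        ("p-001", "lung cancer", "therapy X", "response", 1),
        ("p-002", "lung cancer", "therapy Y", "no response", 0),  # not shared
    ]),
    "center_b": make_partner_db([
        ("p-101", "lung cancer", "therapy X", "response", 1),
        ("p-102", "melanoma", "therapy Z", "response", 1),
    ]),
}

result = federated_query(
    partners,
    "SELECT pseudonym, therapy, outcome FROM shared_patients WHERE diagnosis = ?",
    ("lung cancer",),
)
for row in result:
    print(row)
```

In this sketch the master only ever queries the `shared_patients` view, so the record that is not flagged as shared never leaves its center. A real deployment would additionally need authentication, network transport, and schema mapping between the heterogeneous partner systems, none of which is modeled here.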

Building on our long-standing experience in applying in-memory technology to selected enterprise challenges, we also focus on processing and analyzing scientific data sets in real time. In particular, the applicability of in-memory technology to the analysis of clinical data will be evaluated. Proof-of-concept prototypes will be engineered and shown to potential end users in the course of this project. We can build on the existing federated in-memory database for life sciences, "Analyze Genomes".


Project Partners

The project team will have frequent contact with experts from our cooperation partners SAP AG, Walldorf, and Charité – Universitätsmedizin Berlin.


The project team will work with the latest server hardware, in-memory, and multi-core technology provided by the “Enterprise Application Architecture Laboratory” in our group and HPI’s “Future SOC Lab”. The laboratory forms the foundation for HPI’s in-memory technology activities. Thanks to our cooperation with hardware and software vendors, we are able to access high-end hardware and software before it is available on the public market. For example, SAP’s in-memory database “SAP HANA”, which is optimized for enterprise data management, will be used as the technology foundation.