Das Hasso-Plattner-Institut bietet seinen Studierenden ein gebührenfreies und praxisnahes Informatikstudium an einem international renommierten Institut. Zum Studienangebot zählen der deutschlandweit einzigartige Bachelorstudiengang IT-Systems Engineering und die fünf Masterstudiengänge Cybersecurity, Data Engineering, Digital Health, IT-Systems Engineering und Software Systems Engineering.

Unsere Forschenden am HPI profitieren von einem inspirierenden wissenschaftlichen Umfeld sowie einer kollaborativen und inklusiven Arbeitsatmosphäre. So entstehen Erkenntnisse auf hohem wissenschaftlichen Niveau, die zum Wohl der Gesellschaft beitragen. Unsere wissenschaftliche Arbeit ist in Research Cluster strukturiert. Zudem kooperieren wir in zahlreichen Forschungsprogrammen mit wissenschaftlichen Institutionen, Unternehmen und öffentlichen Einrichtungen auf nationaler und internationaler Ebene.

Das Hasso-Plattner-Institut in Potsdam ist einzigartig in der deutschen Universitätslandschaft. Unterstützt durch Stifter Hasso Plattner und durch internationale Kooperationen bis hin zum Silicon Valley wächst das Angebot des Instituts stetig weiter. Erfahrt mehr über den Stifter, die Veranstaltungen und das Studium am HPI.

Auch für Schüler und Berufstätige bietet das Hasso-Plattner-Institut attraktive Bildungsangebote. Mit openHPI betreibt es eine eigene IT-Bildungsplattform mit kostenlosen Online-Kursen. Die Schülerakademie organisiert Informatik-Camps und Veranstaltungen für Schüler. Die HPI Academy bietet Berufstätigen Bildungsangebote im Design Thinking.

Der Pressebereich des Hasso-Plattner-Instituts versorgt Sie regelmäßig mit allen aktuellen Neuigkeiten, Infos zu unseren Social-Media-Kanälen, Kontaktdaten und ausführlichem Pressematerial.

Jens Krüger

Enterprise-specific In-Memory Data Management: HYRISEc - An In-Memory Column Store Engine for OLXP

Enterprise applications are presently built on a 20-year-old data management infrastructure that was designed to meet a specific set of requirements for transactional systems. In the meantime, enterprise applications have become more sophisticated, data set sizes have increased, requirements on the freshness of data have been strengthened, and the time allotted for completing business processes has been reduced. To meet these challenges, enterprise applications have become increasingly complicated to make up for shortcomings in the data management infrastructure. These complications increase the total cost of ownership of the applications and make them harder to use. This thesis pursues the idea of designing an enterprise application-specific database engine, which is better optimized for the observed workload and data characteristics, while leveraging latest hardware trends and advances in data processing algorithms. As a result, the actual requirements, data characteristics, and as workloads from today's enterprise applications are extracted, a novel workload category called Online Mixed Workload Processing (OLXP) is defined and the enterprise application-specific database engine HYRISEc is presented. HYRISEc facilitates read-optimized in-memory data structures since today's database systems are designed for a more update intensive workload than they are actually facing. Traditional read-optimized databases use a dictionary encoded compressed column-oriented approach, especially in combination with an in-memory architecture. Inserting a tuple in such a compressed store is as complex as inserting a value in a sorted column, because the entire compression has to be rebuilt. Furthermore, traditional index structures cannot be applied efficiently as these databases are not based on a page structure.

To handle updates in a compressed storage efficiently, HYRISEc implements a technique called differential store, maintaining a small write-optimized delta partition that accumulates all updates. Periodically, this delta partition is combined with the read-optimized main partition. This merge process involves decompressing the compressed main partition, merging the delta and main partitions, and re-compressing the resulting main partition. For transactional enterprise applications it is crucial that their data is always available in a 24/7 environment and system downtimes are not allowed. As such, the merge process must be performed online and fast enough, so as not to degrade the update throughput. The update performance of such a system is limited by two factors: a) the insert rate for the write-optimized structure, and b) the speed with which the system can merge the accumulated updates back into the read-optimized partition, while keeping the system online without any downtime. The merge process becomes the main bottleneck for the system, and needs to be optimized by orders of magnitude to support fast updates. Consequently, a fast attribute merging algorithm is introduced that performs a linear-time update of the compressed main partition, and performs multi- core aware optimizations to exploit the underlying high compute and bandwidth resources of modern multi-core CPUs.

With regards to fast lookups, memory bandwidth and latency are limiting the execution speed of queries and, therefore, this scarce resource has to be used economically to maximize performance. Scanning a complete column results in the transfer of the entire column from memory to the processor and the costs depend linearly on the column length. A common approach to speed up the access to highly selective subsets of the data is to use indices which enable searches in logarithmical time. Thus, an index is introduced that leverages the proposed architecture. This includes the compressed column- oriented storage for the actual index data structures and the attribute merge algorithm for index maintenance.

To summarize, this thesis presents research results illustrating how an in-memory database engine can be implemented for OLXP by introducing data structures and algorithms to enable fast updates and lookups by leveraging the potential of a read-optimized store at the same time.

Ombudsperson

Ombudspersonen beraten als neutrale und qualifizierte Ansprechpersonen in Fragen guter wissenschaftlicher Praxis und in Verdachtsfällen wissenschaftlichen Fehlverhaltens.

Sie tragen, soweit möglich, zur lösungsorientierten Konfliktvermittlung bei.

Bei Fragen wenden Sie sich bitte an:

Prof. Dr. Tilmann Rabl

Tel.: +49 (0)331 5509-280
E-Mail: tilmann.rabl(at)hpi.de

Future SOC Lab

Das Future SOC Lab des Hasso-Plattner-Instituts bietet externen Wissenschaftlern neueste Hard- und Software-Infrastrukturen kostenfrei zu Forschungszwecken.

Informationen zum Future SOC Lab

Research Schools

Die HPI Research Schools für "Service-Oriented Systems Engineering" und "Data Science and Engineering" sind die Graduiertenschule des Hasso-Plattner-Instituts. Sie unterhalten Außenstellen in Südafrika, Israel, China und den USA.

Informationen zu den Research Schools

Digital Health Cluster

Das Digital Health Cluster (DHC) ist ein offenes Netzwerk für Wissenschaftler und Forschungseinrichtungen, die gemeinsam das Gesundheitssystem mit neuen Digital-Health-Anwendungen voranbringen und den Patienten stärken möchten.

Informationen zum Digital Health Cluster