Prof. Dr. h.c. Hasso Plattner

Master Thesis Topic Areas

Please find our list of available master's thesis topics below. If you are interested in any of these topics, please contact the responsible research assistant for further information.

Tracing and Sampling Memory Accesses and the Conflict between Accuracy and Performance

High-capacity NVRAM will soon enter the storage pyramid between DRAM and SSDs. It allows for cheaper main memory, but will initially be slower than DRAM. We expect data structures to be placed either on DRAM or on NVRAM, depending on how they are used, with the goal of minimizing the impact of NVRAM’s higher latency. In our research group, we developed a system that automatically migrates data between DRAM and NVRAM. To do so efficiently, we need to understand how data is accessed. This includes the frequency and recency of accesses as well as their type, such as sequential versus random accesses.
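To make these access characteristics concrete, the following minimal Python sketch derives them from a recorded address trace. It is an illustration only and not the group's system: the function name `summarize_accesses`, the 64-byte line size, and the definition of "sequential" (an access to the same or the next cache line) are assumptions.

```python
from collections import Counter

CACHE_LINE = 64  # bytes; assumed granularity, not taken from the text above


def summarize_accesses(trace, line_size=CACHE_LINE):
    """Summarize a time-ordered list of accessed byte addresses.

    Returns the per-line access frequency, the position of the last access
    to each line (recency), and the share of consecutive accesses that stay
    on the same cache line or move to the next one (sequential ratio).
    """
    lines = [addr // line_size for addr in trace]
    frequency = Counter(lines)
    # Later positions overwrite earlier ones, so this keeps the last access.
    last_access = {line: pos for pos, line in enumerate(lines)}
    sequential = sum(
        1 for prev, cur in zip(lines, lines[1:]) if 0 <= cur - prev <= 1
    )
    steps = max(len(lines) - 1, 1)
    return {
        "frequency": frequency,
        "last_access": last_access,
        "sequential_ratio": sequential / steps,
    }


if __name__ == "__main__":
    scan = [i * 8 for i in range(64)]      # 8-byte values read in order
    jumps = [0, 4096, 128, 8192]           # accesses scattered across pages
    print(summarize_accesses(scan)["sequential_ratio"])
    print(summarize_accesses(jumps)["sequential_ratio"])
```

A placement policy could use such a summary to keep frequently and randomly accessed structures on DRAM, since random accesses are the ones most exposed to higher latency.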

Many approaches exist to trace memory accesses at runtime. They vary in their accuracy and in the overhead they impose on execution. For instance, interrupting the program on every load and store captures all memory accesses, but comes with a runtime cost that is prohibitive for live applications. Other approaches use hardware counters, modifications to the page management, or code hot patching.
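The conflict between accuracy and overhead can be sketched with a synthetic trace. The Python example below records only every n-th access and scales the counts back up, then compares the estimated per-page counts with the exact ones. All names (`build_trace`, `page_counts`, `relative_error`), the 4096-byte page size, and the trace itself are assumptions made for illustration; real tracing mechanisms behave differently in detail.

```python
from collections import Counter

PAGE_SIZE = 4096  # bytes; assumed page granularity


def build_trace():
    """Synthetic trace: one hot page touched 1000 times, ten cold pages once each."""
    trace = []
    for i in range(1000):
        trace.append(0)  # an address on the hot page 0
        if i % 100 == 0:
            trace.append((1 + i // 100) * PAGE_SIZE)  # first byte of a cold page
    return trace


def page_counts(trace, period=1):
    """Count accesses per page from every `period`-th access, scaled by `period`."""
    counts = Counter()
    for addr in trace[::period]:
        counts[addr // PAGE_SIZE] += period
    return counts


def relative_error(exact, estimate):
    """Sum of absolute per-page count errors, relative to the total access count."""
    total = sum(exact.values())
    diff = sum(abs(exact[page] - estimate.get(page, 0)) for page in exact)
    return diff / total


if __name__ == "__main__":
    trace = build_trace()
    exact = page_counts(trace)
    for period in (1, 10, 100):
        estimate = page_counts(trace, period)
        recorded = len(trace[::period])  # proxy for tracing overhead
        print(period, recorded, relative_error(exact, estimate))
```

Larger sampling periods record fewer events, which stands in for lower overhead here, but rarely touched pages may then be missed entirely while the hot page is still identified.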

The goal of this work is to (1) compare and evaluate different approaches and (2) build a library that unifies different approaches behind a common frontend.

Contact: Markus Dreseler

Data Management for Non-Volatile Memories

Storage Class Memories (SCM) are a new class of byte-addressable persistent storage media that blur the line between memory and storage due to their memory-like latency (~100 ns). They are expected to lead to new programming paradigms that give memory-like, byte-level access to non-volatile storage. On the memory side, sharing data across processes and ensuring consistent address spaces across server reboots become important issues. On the storage side, atomicity of updates, controlling the visibility of in-flight updates, versioning, and failure/disaster recovery become key data management challenges.

  • Data Structures for In-Memory Column Stores using Non-Volatile Memories
    The goal of this master's thesis is to investigate the applications of SCM in the context of in-memory column stores. How can in-memory databases profit from large amounts of SCM, and what kind of data structures are needed to address the scale and possible distribution of data in such systems, especially in the context of transaction processing, logging, and recovery?
    Contact: David Schwalb
  • Distributed In-Memory Column Stores using Non-Volatile Memories
    Distributed database systems that leverage fast interconnects and keep all data in DRAM scale well, but because DRAM is volatile, such systems typically achieve durability by replicating data across multiple machines. This thesis will investigate the potential of distributed systems using non-volatile memories, as well as how concepts and data structures can be adapted to exploit the durability of SCM.
    Contact: David Schwalb

Indices for In-Memory Databases

Indices are the only way to achieve competitive throughput for transactional applications. Since modern databases keep all data in main memory, the focus of indices has shifted: a complete scan of the raw data is no longer prohibitively slow, but indices remain crucial for achieving high throughput. The topics below aim to quantify the impact of indices and to develop efficient maintenance strategies for in-memory indices.

  • Composite Indices
    The scope of this topic is the implementation and evaluation of composite indices (multi-column indices) in our open-source in-memory database Hyrise.
    Contact: Martin Faust
  • Primary Key Clustered Indices
    This topic aims to evaluate the costs of a clustered index (i.e., sorting the relation by its index key) for main-memory databases with a main/delta architecture.
    Contact: Martin Faust
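The cost question behind the clustered-index topic can be illustrated with a small Python sketch. It assumes a simplified main/delta layout in which the main partition is kept sorted by key and new rows are appended to an unsorted delta; the class `MainDeltaTable` and its methods are hypothetical and do not reflect the Hyrise implementation.

```python
import bisect


class MainDeltaTable:
    """Rows are (key, payload) tuples; main is sorted by key, delta is append-only."""

    def __init__(self, rows=()):
        self.main = sorted(rows, key=lambda row: row[0])
        self._main_keys = [row[0] for row in self.main]
        self.delta = []

    def insert(self, key, payload):
        # Inserts never reorder the main partition; they only grow the delta.
        self.delta.append((key, payload))

    def lookup(self, key):
        # Binary search on the clustered main partition ...
        lo = bisect.bisect_left(self._main_keys, key)
        hi = bisect.bisect_right(self._main_keys, key)
        hits = self.main[lo:hi]
        # ... plus a linear scan of the unsorted delta.
        hits += [row for row in self.delta if row[0] == key]
        return hits

    def merge(self):
        # Folding the delta into main re-establishes the sort order;
        # this is where the cost of keeping the relation clustered is paid.
        self.main = sorted(self.main + self.delta, key=lambda row: row[0])
        self._main_keys = [row[0] for row in self.main]
        self.delta = []


if __name__ == "__main__":
    table = MainDeltaTable([(3, "c"), (1, "a")])
    table.insert(2, "b")
    print(table.lookup(2))
    table.merge()
    print(table.main)
```

In this sketch, lookups on the main partition are logarithmic while the merge has to re-sort the data, which is the kind of trade-off an evaluation would need to measure.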


Hyrise-R is a lazy master replication extension for the in-memory database Hyrise. Distributed databases are especially important today, as the number of replicated and/or partitioned databases is increasing. We offer topics that investigate distributed database concepts such as availability, scalability, and capacity, in the scope of our research database Hyrise and beyond.

Contact: Stefan Klauck

Advanced Debugging Tools

Many sophisticated tools exist for understanding and finding bugs in object-oriented programs. For database applications, however, the tool support barely extends beyond simple debuggers.

  • Dynamic Slicing in Stored Procedures
    Slicing is a technique for identifying logical relations between statements. The scope of this topic is to develop and evaluate an analysis for stored procedures that can determine logical dependencies between instructions and use runtime data to explain how and why data was changed.
    Contact: Arian Treffer
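The idea of dynamic slicing can be sketched in a few lines of Python. The example below assumes the executed statements have already been recorded as a trace of (statement, defined variables, used variables) entries and follows data dependencies only; control dependencies, which a full analysis would also need, are ignored. The function name `dynamic_slice` and the trace format are assumptions made for illustration.

```python
def dynamic_slice(trace, target):
    """Return the executed statements that influenced `target`, in execution order.

    `trace` is a list of (statement, defs, uses) tuples, where `defs` and
    `uses` are sets of variable names written and read by that statement.
    Only data dependencies are followed.
    """
    relevant = {target}
    in_slice = []
    for statement, defs, uses in reversed(trace):
        if relevant & defs:
            in_slice.append(statement)
            # The statement explains the variables it defines; what it
            # reads now has to be explained by earlier statements.
            relevant -= defs
            relevant |= uses
    return list(reversed(in_slice))


if __name__ == "__main__":
    # Hypothetical trace of a small procedure run.
    trace = [
        ("s1: total = 0", {"total"}, set()),
        ("s2: rate = 0.1", {"rate"}, set()),
        ("s3: price = row.price", {"price"}, {"row"}),
        ("s4: total = total + price", {"total"}, {"total", "price"}),
        ("s5: log = rate", {"log"}, {"rate"}),
    ]
    for statement in dynamic_slice(trace, "total"):
        print(statement)
```

For the variable `total`, the slice contains s1, s3, and s4, and leaves out the statements that only affect `rate` and `log`.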

Natural Language Processing and Text Mining

The current data deluge demands fast, real-time processing of large datasets to support various applications. This includes textual data such as scientific publications, web pages, and social media messages. Natural language processing (NLP) is the field of automatically processing textual documents. It includes a variety of computational linguistic tasks, such as tokenization, part-of-speech tagging, chunking, and syntactic parsing, as well as semantic tasks such as named-entity recognition, relation extraction, and semantic role labeling. Processing and semantically annotating large text collections is a time-consuming task that requires the integration of various tools. In-memory database (IMDB) technology offers an alternative, given its ability to process large document collections quickly. We use IMDB technology to develop applications for a variety of NLP tasks, such as question answering, text summarization, information retrieval, information extraction, and machine translation.

Contact: Dr. Mariana Neves