Our team is giving a series of lectures and seminars with a focus on enterprise systems design and in-memory data management. Strong links to the industry ensure a close connection between theory and its implementation in the real world.

If you have questions regarding one of our publications, please contact the authors.

Master's Project: Extend Your Own Database

General Information

Teaching staff: Dr. Rainer Schlosser, Dr. Michael Perscheid, Marcel Weisgut
Master’s programs: ITSE, DE
Time: tbd
First Meeting: tba
Location: Zoom and/or V 2.16
Content:
- Group work
- Programming project
- Final presentation (mid-February)
- Research report (due: March 31)
Extent: 8 SWS / 12 ECTS

Description

Relational in-memory database systems achieve high query processing performance by storing all their data in DRAM, which provides lower data access latency than disks. However, DRAM is still relatively expensive compared to other storage technologies such as modern SSDs. Therefore, for cost-effectiveness and to avoid potential DRAM capacity limitations, we may want to store some parts of the data on secondary storage devices, resulting in larger-than-memory database systems. Two common approaches for implementing larger-than-memory databases are a buffer manager and memory-mapped file I/O, e.g., via the OS-provided mmap system call.
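To make the memory-mapped approach more concrete, the following minimal sketch writes a column of integers to a file and then reads it back through a read-only mapping created with the POSIX mmap call. The helper names persist_column and sum_mapped_column and the raw 64-bit file layout are assumptions made for illustration; they are not part of Hyrise.

```cpp
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: write a column of 64-bit integers to a file as raw bytes.
void persist_column(const std::string& path, const std::vector<int64_t>& values) {
  const int fd = ::open(path.c_str(), O_CREAT | O_TRUNC | O_WRONLY, 0644);
  if (fd == -1) throw std::runtime_error("open for writing failed");

  const char* bytes = reinterpret_cast<const char*>(values.data());
  size_t remaining = values.size() * sizeof(int64_t);
  while (remaining > 0) {  // write() may write fewer bytes than requested
    const ssize_t written = ::write(fd, bytes, remaining);
    if (written <= 0) {
      ::close(fd);
      throw std::runtime_error("write failed");
    }
    bytes += written;
    remaining -= static_cast<size_t>(written);
  }
  ::fsync(fd);  // ask the OS to flush the data to the storage device
  ::close(fd);
}

// Hypothetical helper: map the file read-only and sum its values.
// Pages are loaded lazily by the OS when they are first accessed.
int64_t sum_mapped_column(const std::string& path) {
  const int fd = ::open(path.c_str(), O_RDONLY);
  if (fd == -1) throw std::runtime_error("open for reading failed");

  struct stat file_stat{};
  if (::fstat(fd, &file_stat) == -1) {
    ::close(fd);
    throw std::runtime_error("fstat failed");
  }
  const size_t size = static_cast<size_t>(file_stat.st_size);
  if (size == 0) {  // mmap rejects zero-length mappings
    ::close(fd);
    return 0;
  }
  assert(size % sizeof(int64_t) == 0);

  void* mapping = ::mmap(nullptr, size, PROT_READ, MAP_PRIVATE, fd, 0);
  ::close(fd);  // the mapping remains valid after the descriptor is closed
  if (mapping == MAP_FAILED) throw std::runtime_error("mmap failed");

  const auto* values = static_cast<const int64_t*>(mapping);
  const size_t count = size / sizeof(int64_t);
  const int64_t sum = std::accumulate(values, values + count, int64_t{0});
  ::munmap(mapping, size);
  return sum;
}
```

Calling persist_column("/tmp/column.bin", {1, 2, 3, 4}) followed by sum_mapped_column("/tmp/column.bin") reads the data without an explicit read call. With a buffer manager, the database itself would instead decide which pages to load and evict.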

For small and mid-size data sets, the performance of Hyrise is competitive with that of comparable systems such as MonetDB, DuckDB, HyPer, and Umbra. Now we want to move toward processing terabytes of data. In this project, we will extend our database system Hyrise from a pure main memory to a larger-than-memory database system using the memory-mapped file I/O approach. After you have been introduced to the most important components of Hyrise by your supervisors and have familiarized yourself with the codebase, we will first focus on implementing a mechanism to persist table data on SSDs and load the stored data into the main memory efficiently. Second, we will evaluate different libraries for memory-mapped file I/O, including their page fault handling, to identify a particularly well-suited library for the targeted database workloads.
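As a starting point for looking at page fault behavior, the sketch below counts the page faults a process incurs while touching every page of a file-backed mapping for the first time. It uses the POSIX getrusage call; the names FaultCounts, TouchResult, and touch_mapped_pages, as well as the temporary file under /tmp, are assumptions for illustration only and do not describe how Hyrise or any particular mmap library is evaluated.

```cpp
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <stdexcept>

struct FaultCounts {
  long minor_faults;  // faults served without I/O
  long major_faults;  // faults that required I/O
};

// Read the process-wide page fault counters maintained by the OS.
FaultCounts read_fault_counters() {
  struct rusage usage{};
  if (::getrusage(RUSAGE_SELF, &usage) != 0) throw std::runtime_error("getrusage failed");
  return {usage.ru_minflt, usage.ru_majflt};
}

struct TouchResult {
  uint64_t checksum;
  FaultCounts faults;
};

// Hypothetical helper: map a fresh temporary file and touch one byte per page,
// reporting how many page faults occurred during the accesses.
TouchResult touch_mapped_pages(size_t page_count) {
  if (page_count == 0) throw std::invalid_argument("page_count must be positive");
  const long raw_page_size = ::sysconf(_SC_PAGESIZE);
  if (raw_page_size <= 0) throw std::runtime_error("sysconf failed");
  const size_t page_size = static_cast<size_t>(raw_page_size);
  const size_t size = page_count * page_size;

  char path[] = "/tmp/fault_sketch_XXXXXX";  // assumes /tmp is writable
  const int fd = ::mkstemp(path);
  if (fd == -1) throw std::runtime_error("mkstemp failed");
  ::unlink(path);  // the file disappears once the mapping is released
  if (::ftruncate(fd, static_cast<off_t>(size)) != 0) {
    ::close(fd);
    throw std::runtime_error("ftruncate failed");
  }

  void* mapping = ::mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
  ::close(fd);
  if (mapping == MAP_FAILED) throw std::runtime_error("mmap failed");

  auto* bytes = static_cast<unsigned char*>(mapping);
  const FaultCounts before = read_fault_counters();
  uint64_t checksum = 0;
  for (size_t page = 0; page < page_count; ++page) {
    bytes[page * page_size] = 1;          // first access to this page
    checksum += bytes[page * page_size];  // read the value back
  }
  const FaultCounts after = read_fault_counters();
  assert(checksum == page_count);

  ::munmap(mapping, size);
  return {checksum,
          {after.minor_faults - before.minor_faults, after.major_faults - before.major_faults}};
}
```

The reported numbers depend on the operating system, the file system, and features such as fault-around, so they should be read as a rough indicator rather than a precise measurement.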

We aim for results that can be integrated into the main code base and push forward the open-source Hyrise project. After this project, there will be research and engineering opportunities to dive deeper into identified issues in the form of student assistantships, master’s theses, and Ph.D. positions.

Learning Goals

Through successful completion of this project, you will:

- Improve your programming and teamwork skills
- Learn to familiarize yourself with and work on an existing large software project
- Learn to identify and eliminate performance bottlenecks
- Learn to perform experimental evaluations
- Deepen your database and memory management knowledge
- Improve your research methodology and academic writing

Prerequisites

A prior understanding of database fundamentals (e.g., from the database systems lecture or the Develop your own Database seminar) and knowledge of C++ are expected.

Initial References

Radu Stoica and Anastasia Ailamaki: Enabling Efficient OS Paging for Main-Memory OLTP Databases. DaMoN 2013
Lin Ma et al.: Larger-than-Memory Data Management on Modern Storage Hardware for In-Memory OLTP Database Systems. DaMoN 2016
Andrew Crotty et al.: Are You Sure You Want to Use MMAP in Your Database Management System? CIDR 2022
Ivy Peng et al.: UMap: Enabling Application-driven Optimizations for Page Management. MCHPC 2019
Anastasios Papagiannis et al.: Memory-Mapped I/O on Steroids. EuroSys 2021

Contact

For questions and details, visit us at the Villa, 2nd floor on Campus II, or send us an email:

Marcel Weisgut

Literature

"A Course in In-Memory Data Management" by Prof. Dr. h.c. Hasso Plattner. This book is the culmination of six years of in-memory research. As such, it provides the technical foundation for combined transactional and analytical workloads inside a single database, as well as examples of new applications that this technology makes possible. The book is available from Springer.

Contact

Dr. Michael Perscheid

Chair Representative

Tel.: +49 (331) 5509-566

E-Mail: michael.perscheid(at)hpi.de

Office:

Room: V-2.12

Tel.: +49 (331) 5509-560

Fax: +49 (331) 5509-579

E-Mail: office-epic(at)hpi.de
