Our team is giving a series of lectures and seminars with a focus on enterprise systems design and in-memory data management. Strong links to the industry ensure a close connection between theory and its implementation in the real world.

If you have questions regarding one of our publications, please contact the authors.

Master's Project - Resource Allocation for Scale-Out Database Systems and Cloud Computing

General Information

Teaching staff: Stefan Halfpap, Dr. Rainer Schlosser, Dr. Michael Perscheid
Master’s programs: ITSE, DE
Extent: 8 SWS / 12 ECTS
Location: Zoom and V-1.03/a
Content:
- Group work
- Programming project
- Final presentation (July/August)
- Research report (due: September 30)

Description

Growing data volumes and the desire to analyze this data require multi-node data management systems, e.g., scale-out database systems. Such systems are increasingly deployed in cloud environments. This raises the questions of how to efficiently distribute data and processing load within a multi-node system and how to assign cloud resources to it. Therefore, in this master’s project, we develop approaches for resource-efficient allocations in the context of scale-out database systems and cloud computing.

Allocation problems are optimization problems and are omnipresent in database and enterprise systems. Many of these problems are NP-hard. In practice, this means that as input sizes grow, they quickly can no longer be solved in a reasonable amount of time, particularly with a brute-force approach that examines all possible solution candidates. In the field of mathematical programming, we can use off-the-shelf solvers, which search for optimal solutions efficiently, to mitigate the increase in calculation time. Further, we can use the power of these solvers to build efficient heuristics.
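To make the growth of the brute-force search space concrete, here is a minimal Python sketch. It assumes a toy problem that is not taken from the project itself: assign queries with known loads to nodes so that the most heavily loaded node is as lightly loaded as possible. The function name `brute_force_balance` and the load values are hypothetical.

```python
from itertools import product


def brute_force_balance(loads, num_nodes):
    """Examine every query-to-node assignment and keep the best one.

    Returns (smallest maximum node load, one optimal assignment,
    number of candidates examined).
    """
    best_cost, best_assignment, examined = float("inf"), None, 0
    # product(...) enumerates all num_nodes ** len(loads) assignments.
    for assignment in product(range(num_nodes), repeat=len(loads)):
        examined += 1
        node_load = [0] * num_nodes
        for query, node in enumerate(assignment):
            node_load[node] += loads[query]
        cost = max(node_load)
        if cost < best_cost:
            best_cost, best_assignment = cost, assignment
    return best_cost, best_assignment, examined


# Hypothetical loads for six queries, distributed over three nodes.
loads = [7, 5, 4, 3, 3, 2]
cost, assignment, examined = brute_force_balance(loads, 3)
print(cost, examined)  # 9 729
```

Six queries on three nodes already yield 3**6 = 729 candidates, and every additional query triples that number. This is why the project relies on solvers and solver-based heuristics rather than on enumeration.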

We have previously developed a decomposition-based allocation approach using mixed-integer linear programming (a subclass of mathematical optimization) for partially replicated database clusters.

In this master’s project, we want to investigate mathematical programming approaches for adapted problems, which are characterized by other optimization goals and constraints:

- For scale-out database systems, we want to improve memory efficiency (i.e., data reuse) when queries are distributed across multiple database nodes.
- In the context of cloud computing, we want to optimize resource utilization when placing virtual machines with allocation constraints (e.g., co-location and fault tolerance) in a cluster.
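The following Python sketch illustrates what data reuse means for the first goal. It assumes a simplified model that is not necessarily the one used in the project: every node must store all fragments accessed by the queries assigned to it, so queries that share fragments are cheaper to place together. The function `node_memory`, the fragment names, and the sizes are hypothetical, and load balancing is ignored for brevity.

```python
def node_memory(assignment, query_fragments, fragment_size):
    """Return the memory needed per node.

    A node stores the union of the fragments accessed by its queries,
    so a fragment shared by several queries on one node counts once.
    """
    stored = {}
    for query, node in assignment.items():
        stored.setdefault(node, set()).update(query_fragments[query])
    return {node: sum(fragment_size[f] for f in frags)
            for node, frags in stored.items()}


# Hypothetical fragment sizes (e.g., in GB) and query access sets.
fragment_size = {"orders": 40, "lineitem": 100, "customer": 10, "part": 20}
query_fragments = {
    "q1": {"orders", "lineitem"},
    "q2": {"lineitem", "part"},
    "q3": {"orders", "customer"},
    "q4": {"customer", "part"},
}

# Two ways to distribute the four queries across two nodes.
scattered = {"q1": 0, "q2": 1, "q3": 1, "q4": 0}
grouped = {"q1": 0, "q2": 0, "q3": 1, "q4": 1}

for name, assignment in (("scattered", scattered), ("grouped", grouped)):
    memory = node_memory(assignment, query_fragments, fragment_size)
    print(name, memory, "total:", sum(memory.values()))
```

In this example, the scattered assignment forces both nodes to hold all four fragments (340 in total), while the grouped assignment stores the large `lineitem` fragment only once (230 in total). A realistic model would additionally have to balance the query load across nodes, which is where the optimization problem becomes hard.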

Prerequisites

There are no particular prerequisites for this project. Prior knowledge of database or distributed systems may help you understand the problem domain. We will give you an overview of the problem domain and teach you to solve optimization problems using solvers, such as Gurobi and CPLEX, via the modeling language AMPL. For the rest of the programming work, we will use Python.

Learning Goals

Through active participation in this project, you will:

- Learn to solve optimization problems with integer linear programming
- Apply techniques to control the problem complexity and, thus, calculation time
- Understand the possibilities to scale database systems
- Understand challenges for virtual machine placement
- Improve your research methodology and academic writing

After this project, there will be research opportunities to dive deeper into identified issues in the form of master’s theses and PhD positions.

Resources

https://ampl.com
Halfpap and Schlosser: Workload-Driven Fragment Allocation for Partially Replicated Databases Using Linear Programming. ICDE 2019
Halfpap and Schlosser: Memory-Efficient Database Fragment Allocation for Robust Load Balancing when Nodes Fail. ICDE 2021
Rabl and Jacobsen: Query Centric Partitioning and Allocation for Partially Replicated Database Systems. SIGMOD 2017
Schlosser and Halfpap: Robust and Memory-Efficient Database Fragment Allocation for Large and Uncertain Database Workloads. EDBT 2021

News

22.09.2023 | Trends and Concepts in the Software Industry Seminar Offered in Winter Semester 2023/2024

22.05.2023 | Christopher Hagedorn Successfully Defended His PhD Thesis

03.03.2023 | Last Trends and Concepts course of Prof. Hasso Plattner

After more than 20 years of teaching, our founder and benefactor Prof. Hasso Plattner visited the HPI this week for his …

01.03.2023 | Jan Kossmann Successfully Defended His PhD Thesis

Last week, Jan Kossmann, another PhD student of our EPIC group, successfully defended his thesis on the topic of …

26.02.2023 | Paper on Data Tiering in Hyrise Published in BTW Proceedings

Our latest paper on data tiering in Hyrise "Workload-Driven Data Placement for Tierless In-Memory Database Systems" by …

24.02.2023 | Paper on EPIC Research Group Published in SIGMOD Record

Our report “Enterprise Platform and Integration Concepts Research at HPI” has been published in the December issue of …

30.11.2022 | Paper on Database Optimizations for Spatio-Temporal Data Published in PVLDB

Our paper “Robust and Budget-Constrained Encoding Configurations for In-Memory Database Systems” has been published in …

04.10.2022 | Günter Hesse Successfully Defended His PhD Thesis

Last week, Günter Hesse, another PhD student of our EPIC group, successfully defended his thesis on the topic of "A …

08.07.2022 | Successful PhD Defense by Markus Dreseler

Markus Dreseler has successfully defended his PhD thesis on Automatic Tiering for In-Memory Database Systems.

Literature

"A Course in In-Memory Data Management" by Prof. Dr. h.c. Hasso Plattner. This book is the culmination of six years of in-memory research. As such, it provides the technical foundation for combined transactional and analytical workloads inside a single database, as well as examples of new applications that this technology makes possible. The book is available from Springer.

Contact

Dr. Michael Perscheid

Chair Representative

Tel.: +49 (331) 5509-566

E-Mail: michael.perscheid(at)hpi.de

Office:

Room: V-2.12

Tel.: +49 (331) 5509-560

Fax: +49 (331) 5509-579

E-Mail: office-epic(at)hpi.de
