Growing data volumes and the desire to analyze this data require multi-node data management systems, e.g., scale-out database systems. Such systems are increasingly deployed in cloud environments. This raises the question of how to efficiently distribute data and processing load within a multi-node system and how to assign cloud resources to it. In this master’s project, we therefore develop approaches for resource-efficient allocations in the context of scale-out database systems and cloud computing.
Allocation problems are optimization problems that are omnipresent in database and enterprise systems. Many of them are NP-hard, i.e., no polynomial-time algorithm is known for them, so as input sizes increase they quickly become impossible to solve in a reasonable amount of time, particularly with a brute-force approach that examines all possible solution candidates. Off-the-shelf solvers from the field of mathematical programming search for optimal solutions far more efficiently and thereby mitigate the increase in calculation time. Further, we can use the power of these solvers to build efficient heuristics.
We have previously developed a decomposition-based allocation approach for partially replicated database clusters, using mixed-integer linear programming (a subclass of mathematical programming).
In this master’s project, we want to investigate mathematical programming approaches for two adapted problems, which differ in their optimization goals and constraints:
For scale-out database systems, we want to improve memory efficiency (i.e., data reuse) when queries are distributed across multiple database nodes.
In the context of cloud computing, we want to optimize resource utilization when placing virtual machines with allocation constraints (e.g., co-location and fault tolerance) in a cluster.
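To make the second problem and the cost of brute force more tangible, the following minimal sketch enumerates every assignment of virtual machines to hosts and keeps a feasible one that uses the fewest hosts. It is an illustration only, not the approach developed in the project: the function name place_vms, the single resource dimension, the identical host capacity, and the objective (number of hosts used) are all assumptions made for this example. The loop visits num_hosts ** len(vms) candidates, which is exactly the growth that makes enumeration impractical and motivates solver-based formulations.

```python
from itertools import product


def place_vms(demands, capacity, num_hosts, together=(), apart=()):
    """Brute-force search for a feasible placement using the fewest hosts.

    All names and the model itself are hypothetical simplifications:
    demands  -- dict mapping VM name to its demand for a single resource
    capacity -- capacity of each (identical) host
    together -- pairs of VMs that must share a host (co-location)
    apart    -- pairs of VMs that must be on different hosts (fault tolerance)
    Returns (hosts_used, {vm: host_index}) or None if nothing is feasible.
    """
    vms = sorted(demands)
    best = None
    # Every VM can go on every host: num_hosts ** len(vms) candidates.
    for assignment in product(range(num_hosts), repeat=len(vms)):
        host_of = dict(zip(vms, assignment))

        # Capacity constraint: total demand per host must not exceed capacity.
        load = [0] * num_hosts
        for vm, host in host_of.items():
            load[host] += demands[vm]
        if any(host_load > capacity for host_load in load):
            continue

        # Allocation constraints: co-location and separation.
        if any(host_of[a] != host_of[b] for a, b in together):
            continue
        if any(host_of[a] == host_of[b] for a, b in apart):
            continue

        hosts_used = len(set(assignment))
        if best is None or hosts_used < best[0]:
            best = (hosts_used, host_of)
    return best


# Toy instance: two database replicas must be separated,
# while the application and its cache must share a host.
demands = {"db1": 4, "db2": 4, "app": 2, "cache": 2}
result = place_vms(
    demands,
    capacity=8,
    num_hosts=3,
    together=[("app", "cache")],
    apart=[("db1", "db2")],
)
print(result)
```

With four virtual machines and three hosts, the search space has only 81 candidates; at 30 virtual machines and 10 hosts it already has 10^30, which is why a mixed-integer formulation handed to a solver, or a heuristic built on top of one, is the more realistic route.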