Hasso-Plattner-Institut
Prof. Dr. h.c. mult. Hasso Plattner
 

20.12.2018

Paper accepted at EDBT 2019

Our systems paper "Hyrise Re-engineered: An Extensible Database System for Research in Relational In-Memory Data Management" about the recent rewrite of our research database Hyrise has been accepted for publication at the EDBT 2019 conference.

In case you are interested in a pre-print version of the paper, please contact Markus Dreseler.


Abstract:

Research in data management profits when the performance evaluation is based not only on single components in isolation, but uses an actual DBMS end-to-end. Facilitating the integration and benchmarking of new concepts within a DBMS requires a simple setup process, well-documented code, and the possibility to execute both standard and custom benchmarks without tedious preparation. Fulfilling these requirements also makes it easy to reproduce the results later on.
The relational open-source database Hyrise (VLDB, 2010) was presented to make the case for hybrid row- and column-format data storage. Since then, it has evolved from a single-purpose research DBMS towards a platform for various projects, including research in the areas of indexing, data partitioning, and non-volatile memory. With a growing diversity of topics, we have found that the original code base grew to a point where new experimentation was made unnecessarily difficult. Over the last two years, we have rewritten Hyrise from scratch, focusing on building an extensible multi-purpose research DBMS that can serve as an easy-to-extend platform for a variety of experiments and prototyping in database research.
In this paper, we discuss how our learnings from the previous version of Hyrise have influenced our rewrite. We describe the new architecture of Hyrise and highlight the main components. We then show how our extensible plugin architecture facilitates research on diverse DBMS-related aspects without compromising the architectural tidiness of the code. In a first performance evaluation, we show that the execution time of most TPC-H queries is competitive with that of other research databases.