For relational databases, new features are quite rare. SQL has been largely unchanged for years, so what users care about most is more performance for less cost. Typically, standardized benchmarks are used to compare the throughput of two competing database systems, or that of an old and a new version of the same database. One of these benchmarks is TPC-DS, which simulates queries as seen in decision support systems. Compared to the TPC-H benchmark, which we already support, TPC-DS is more challenging because its queries are more complex and its input data is skewed.
In this project, we will take the TPC-DS benchmark as a yardstick for improving our own database, Hyrise. The focus will be (1) on improving the scalability of the system, i.e., using additional CPU cores as efficiently as possible, and (2) on optimizing the query plans so that more efficient execution paths are chosen.
As we already have a benchmark framework in place, it will be a matter of days before we can look at the first performance numbers. From there, we can track our improvements and will have measurable successes early in the project.
We will not perform any “throw-away work”, but aim for results that can be integrated into the main code base and will improve the overall project. After this project, there will be opportunities to dive deeper into identified issues as part of Master’s theses.