Martin Boissier

PhD Candidate

Email:	martin.boissier(at)hpi.de
Address:	August-Bebel-Str. 88, 14482 Potsdam
Room:	V-2.05
Links:	DBLP - personal website

Research Area: Autonomous Data Management

Research

Main Memory Footprint Reduction of In-Memory Database Systems

Database systems that keep their data primarily in main memory provide high query performance but also incur high costs. We have analyzed various real-world enterprise systems and their workload and data characteristics. We found that the main memory footprint can be reduced efficiently by (i) data encoding and (ii) tiering, without significantly degrading performance.
To encode and compress a database instance, we use learned cost models to predict runtimes for various data encodings. We then use linear programming models to determine optimal encoding configurations within a given memory budget. To be applicable in real-world scenarios, the models incorporate robustness measures that mitigate unexpected performance degradations. To efficiently tier data to secondary storage, we extended the hybrid data layout of the first version of Hyrise and evict infrequently accessed columns, storing them in a row-major format.
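
The budget-constrained encoding selection described above can be illustrated with a minimal sketch. It is not the linear programming model used in this research: for brevity, it replaces the solver with an exhaustive search over all combinations, which is only feasible for a handful of columns, and it omits the robustness measures. The column names, encoding options, memory sizes, and runtimes are hypothetical; in practice, the runtimes would come from a learned cost model.

```python
from itertools import product

# Hypothetical options per column: encoding name -> (memory in MB, predicted runtime in ms).
# All names and numbers are made up for illustration only.
OPTIONS = {
    "order_id": {"unencoded": (400, 10.0), "dictionary": (150, 12.0), "frame_of_reference": (100, 15.0)},
    "status": {"unencoded": (200, 8.0), "dictionary": (20, 8.5), "run_length": (5, 11.0)},
    "comment": {"unencoded": (900, 30.0), "dictionary": (600, 34.0), "lz4": (250, 55.0)},
}


def select_encodings(options, budget_mb):
    """Pick one encoding per column so that the summed predicted runtime is
    minimal while the summed memory stays within budget_mb.

    Returns (configuration, total_runtime, total_memory), or None if no
    combination fits into the budget. Exhaustive search is used here instead
    of a linear programming solver.
    """
    columns = list(options)
    best = None
    # Enumerate every combination of (encoding name, (memory, runtime)) pairs.
    for choice in product(*(options[column].items() for column in columns)):
        memory = sum(size for _, (size, _) in choice)
        runtime = sum(cost for _, (_, cost) in choice)
        if memory > budget_mb:
            continue  # violates the memory budget
        if best is None or runtime < best[1]:
            config = {column: name for column, (name, _) in zip(columns, choice)}
            best = (config, runtime, memory)
    return best


if __name__ == "__main__":
    print(select_encodings(OPTIONS, budget_mb=500))
```

With a tighter budget, the search is forced toward smaller but slower encodings; with a budget large enough for all unencoded columns, it keeps everything unencoded. A linear programming formulation expresses the same trade-off but scales to far more columns and encoding options.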

Selected Publications

Hurdelhey, B., Weisgut, M., Boissier, M.: Workload-Driven Data Placement for Tierless In-Memory Database Systems. Datenbanksysteme für Business, Technologie und Web, BTW. pp. 47–70. Gesellschaft für Informatik e.V. (2023).

Richly, K., Schlosser, R., Boissier, M.: Budget-Conscious Fine-Grained Configuration Optimization for Spatio-Temporal Applications. Proceedings of the VLDB Endowment. pp. 4079–4092 (2022).

Boissier, M.: Robust and Budget-Constrained Encoding Configurations for In-Memory Database Systems. Proceedings of the VLDB Endowment. pp. 780–793 (2022).

Heinzl, L., Hurdelhey, B., Boissier, M., Perscheid, M., Plattner, H.: Evaluating Lightweight Integer Compression Algorithms in Column-Oriented In-Memory DBMS. International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, ADMS@VLDB (2021).

Kossmann, J., Boissier, M., Dubrawski, A., Heseding, F., Mandel, C., Pigorsch, U., Schneider, M., Schniese, T., Sobhani, M., Tsayun, P., Wille, K., Perscheid, M., Uflacker, M., Plattner, H.: A Cockpit for the Development and Evaluation of Autonomous Database Systems. 37th IEEE International Conference on Data Engineering, ICDE. pp. 2685–2688 (2021).

@inproceedings{kossmann2021cockpit,
  abstract = {Databases are highly optimized complex systems with a multitude of configuration options. Especially in cloud scenarios with thousands of database deployments, determining optimized database configurations in an automated fashion is of increasing importance for database providers. At the same time, due to increased system complexity, it becomes more challenging to identify well-performing configurations. Therefore, research interest in autonomous or self-driving database systems has increased enormously in recent years. Such systems promise both performance improvements and cost reductions. In the literature, various fully or partially autonomous optimization mechanisms exist that optimize single aspects, e.g., index selection. However, database administrators and developers often distrust autonomous approaches, and there is a lack of practical experimentation opportunities that could create a better understanding. Moreover, the interplay of different autonomous mechanisms under complex workloads remains an open question. The presented cockpit enables an interactive assessment of the impact of autonomous components for database systems by comparing (autonomous) systems with different configurations side by side. Thereby, the cockpit enables users to build trust in autonomous solutions by experimenting with such technologies and observing their effects in practice.},
  author = {Kossmann, Jan and Boissier, Martin and Dubrawski, Alexander and Heseding, Fabian and Mandel, Caterina and Pigorsch, Udo and Schneider, Max and Schniese, Til and Sobhani, Mona and Tsayun, Petr and Wille, Katharina and Perscheid, Michael and Uflacker, Matthias and Plattner, Hasso},
  booktitle = {37th IEEE International Conference on Data Engineering, ICDE},
  keywords = {in-memory_database myown database self-managing autonomous mboissierselected self-driving adm hyrise},
  pages = {2685-2688},
  title = {A Cockpit for the Development and Evaluation of Autonomous Database Systems},
  year = 2021
}

Richly, K., Schlosser, R., Boissier, M.: Joint Index, Sorting, and Compression Optimization for Memory-Efficient Spatio-Temporal Data Management. 37th IEEE International Conference on Data Engineering (ICDE). pp. 1901–1906 (2021).

Dreseler, M., Boissier, M., Rabl, T., Uflacker, M.: Quantifying TPC-H Choke Points and Their Optimizations. Proceedings of the VLDB Endowment. pp. 1206–1220 (2020).

Boissier, M., Jendruk, M.: Workload-Driven and Robust Selection of Compression Schemes for Column Stores. 22nd International Conference on Extending Database Technology, EDBT. pp. 674–677 (2019).

Schlosser, R., Kossmann, J., Boissier, M.: Efficient Scalable Multi-Attribute Index Selection Using Recursive Strategies. 35th IEEE International Conference on Data Engineering, ICDE. pp. 1238–1249. IEEE (2019).

Dreseler, M., Kossmann, J., Boissier, M., Klauck, S., Uflacker, M., Plattner, H.: Hyrise Re-engineered: An Extensible Database System for Research in Relational In-Memory Data Management. 22nd International Conference on Extending Database Technology (EDBT). pp. 313–324 (2019).

@inproceedings{dreseler2018,
  abstract = {Research in data management profits when the performance evaluation is based not only on individual components in isolation, but uses an actual DBMS end-to-end. Facilitating the integration and benchmarking of new concepts within a DBMS requires a simple setup process, well-documented code, and the possibility to execute both standard and custom benchmarks without tedious preparation. Fulfilling these requirements also makes it easy to reproduce the results later on. The relational open-source database Hyrise (VLDB, 2010) was presented to make the case for hybrid row- and column-format data storage. Since then, it has evolved from being a single-purpose research DBMS towards becoming a platform for various projects, including research in the areas of indexing, data partitioning, and non-volatile memory. With a growing diversity of topics, we have found that the original code base grew to a point where new experimentation became unnecessarily difficult. Over the last two years, we have re-written Hyrise from scratch and built an extensible multi-purpose research DBMS that can serve as an easy-to-extend platform for a variety of experiments and prototyping in database research. In this paper, we discuss how our learnings from the previous version of Hyrise have influenced our re-write. We describe the new architecture of Hyrise and highlight the main components. Afterwards, we show how our extensible plugin architecture facilitates research on diverse DBMS-related aspects without compromising the architectural tidiness of the code. In a first performance evaluation, we show that the execution time of most TPC-H queries is competitive to that of other research databases.},
  author = {Dreseler, Markus and Kossmann, Jan and Boissier, Martin and Klauck, Stefan and Uflacker, Matthias and Plattner, Hasso},
  booktitle = {22nd International Conference on Extending Database Technology (EDBT)},
  keywords = {myown mboissierselected adm hyrise},
  month = 3,
  pages = {313-324},
  title = {Hyrise Re-engineered: An Extensible Database System for Research in Relational In-Memory Data Management},
  year = 2019
}

Ruhrländer, R.P., Boissier, M., Uflacker, M.: Improving Box Office Result Predictions for Movies Using Consumer-Centric Models. Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, KDD. pp. 655–664 (2018).

Schlosser, R., Boissier, M.: Dynamic Pricing under Competition on Online Marketplaces: A Data-Driven Approach. Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, KDD. pp. 705–714 (2018).

Boissier, M., Schlosser, R., Uflacker, M.: Hybrid Data Layouts for Tiered HTAP Databases with Pareto-Optimal Data Placements. 34th IEEE International Conference on Data Engineering, ICDE. pp. 209–220 (2018).

Teaching

Lectures and Seminars:

Supervised Master Theses:

"Workload-Driven Smooth Index and Filter Selection for In-Memory Database Scan Acceleration" (November 2022)
"Cost-aware Filtering in Query Processing on Serverless Cloud Infrastructure" (October 2022)
"Automatic Tiering in Hyrise" (September 2022)
"Automatic Clustering in Hyrise" (October 2020)
"Learned Cost Models for Query Optimization" (March 2019)
"Improving Cardinality Estimation and Access Avoidance in Hyrise" (November 2018)
"Data-Driven Ordering and Dynamic Pricing Competition on Online Marketplaces" (May 2018)
"Probabilistic Data Structures for In-Memory Databases" (May 2018)
"Maintainable and Self-Adapting Column Compression Schemes for HTAP Databases" (April 2018)
"Optimizing Database Scan Performance through Access Avoidance in Chunk-Based Databases using Multi-Dimensional Filters" (August 2017)
"Predicting movie success before release – Using individualized econometric models to predict box office performance." (January 2017)
"Workload-Aware Partitioning and Query Pruning for Mixed Workloads on In-Memory Databases" (January 2016)

News

22.09.2023 | Trends and Concepts in the Software Industry Seminar Offered in WiSe 2023/2024

22.05.2023 | Christopher Hagedorn Successfully Defended His PhD Thesis

03.03.2023 | Last Trends and Concepts course of Prof. Hasso Plattner

01.03.2023 | Jan Kossmann Successfully Defended His PhD Thesis

26.02.2023 | Paper on Data Tiering in Hyrise Published in BTW Proceedings

24.02.2023 | Paper on EPIC Research Group Published in SIGMOD Record

30.11.2022 | Paper on Database Optimizations for Spatio-Temporal Data Published in PVLDB

04.10.2022 | Günter Hesse Successfully Defended His PhD Thesis

08.07.2022 | Successful PhD Defense by Markus Dreseler
