Hasso-Plattner-Institut
Prof. Dr. h.c. Hasso Plattner
  
 

Martin Boissier

Research Assistant, PhD Candidate

Email: martin.boissier(at)hpi.de
Phone: +49 (331) 5509 - 1330
Fax: +49 (331) 97992 - 579
Website: https://martin.boissier.de
Room: Hasso-Plattner-Villa, V 2.05

Research

Main Memory Footprint Reduction of In-Memory Database Systems

Modern relational database systems keep data resident in main memory. While this enables high runtime performance, the required computational resources also incur a high total cost of ownership (TCO). It is thus desirable to reduce the main memory footprint while retaining the performance advantage over disk-based systems. In the spectrum from a fully DRAM-resident to a disk-resident database system, the goal is to find the configuration with the maximum runtime performance for a given memory budget. Existing approaches focus on either analytical or transactional systems. For mixed transactional and analytical (OLxP) workloads, however, such a reduction remains an unsolved challenge for which existing methods are insufficient.
We propose methods that allow OLxP database systems to degrade gracefully with decreasing memory budgets and to adapt dynamically to changing workloads. To estimate a configuration’s impact before actually applying it, we build learned performance estimators that allow us to generate robust configurations.
The actual footprint reduction process is divided into three aspects. First, we reduce existing allocations by removing inefficient secondary indices and applying workload-driven compression configurations. Second, we use hybrid table layouts that evict infrequently accessed data to secondary storage tiers. Third, we employ auxiliary data structures that eliminate most unnecessary accesses to secondary storage. This mitigates the negative effects of tiering data to slower storage tiers.
We show that access patterns often seen in real-world systems allow reducing the footprint significantly with negligible performance losses. For very small memory budgets, the auxiliary data structures can avoid the majority of accesses to slower storage tiers.
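
As a rough illustration of the budget-constrained placement idea described above, the following Python sketch keeps the most frequently accessed columns per byte in DRAM and evicts the rest to a secondary storage tier. It is a simple greedy heuristic under stated assumptions, not the method used in the publications listed below: the `Column` type, the `place_columns` function, the column names, and all sizes and access counts are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """Hypothetical description of one column (illustrative only)."""
    name: str
    size_bytes: int  # assumed to be positive
    accesses: int    # assumed access count observed in the workload


def place_columns(columns, dram_budget_bytes):
    """Greedily decide which columns stay in DRAM under a memory budget.

    Columns are ranked by accesses per byte; each column is kept in DRAM
    if it still fits into the remaining budget, otherwise it is evicted.
    Returns (names kept in DRAM, names evicted), both in ranked order.
    """
    ranked = sorted(columns, key=lambda c: c.accesses / c.size_bytes, reverse=True)
    in_dram, evicted = [], []
    remaining = dram_budget_bytes
    for col in ranked:
        if col.size_bytes <= remaining:
            in_dram.append(col.name)
            remaining -= col.size_bytes
        else:
            evicted.append(col.name)
    return in_dram, evicted


if __name__ == "__main__":
    # Made-up example data, not measurements from any real system.
    columns = [
        Column("order_id", 400, 9000),
        Column("status", 100, 5000),
        Column("comment", 2000, 10),
        Column("created_at", 800, 1200),
    ]
    print(place_columns(columns, dram_budget_bytes=1000))
    # prints (['status', 'order_id'], ['created_at', 'comment'])
```

A greedy ranking like this does not guarantee an optimal placement and ignores which columns are accessed together. It is only meant to make the trade-off between memory budget and access frequency concrete.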

Selected Publications

  • Dreseler, M., Boissier, M., Rabl, T., Uflacker, M.: Quantifying TPC-H Choke Points and Their Optimizations. Proceedings of the VLDB Endowment. pp. 1206-1220 (2020).
  • Boissier, M., Jendruk, M.: Workload-Driven and Robust Selection of Compression Schemes for Column Stores. 22nd International Conference on Extending Database Technology (EDBT). pp. 674-677 (2019).
  • Schlosser, R., Kossmann, J., Boissier, M.: Efficient Scalable Multi-Attribute Index Selection Using Recursive Strategies. IEEE 35th International Conference on Data Engineering (ICDE 2019). pp. 1238-1249. IEEE (2019).
  • Dreseler, M., Kossmann, J., Boissier, M., Klauck, S., Uflacker, M., Plattner, H.: Hyrise Re-engineered: An Extensible Database System for Research in Relational In-Memory Data Management. 22nd International Conference on Extending Database Technology (EDBT). pp. 313-324 (2019).
  • Boissier, M.: Reducing the Footprint of Main Memory HTAP Systems: Removing, Compressing, Tiering, and Ignoring Data. PhD Workshop at VLDB. CEUR-WS.org (2018).
  • Schlosser, R., Boissier, M.: Dealing with the Dimensionality Curse in Dynamic Pricing Competition: Using Frequent Repricing to Compensate Imperfect Market Anticipations. Computers and Operations Research. 100, 26-42 (2018).
  • Ruhrländer, R.P., Boissier, M., Uflacker, M.: Improving Box Office Result Predictions for Movies Using Consumer-Centric Models. KDD '18 Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. pp. 655-664 (2018).
  • Schlosser, R., Boissier, M.: Dynamic Pricing under Competition on Online Marketplaces: A Data-Driven Approach. KDD '18 Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. pp. 705-714 (2018).
  • Boissier, M., Schlosser, R., Uflacker, M.: Hybrid Data Layouts for Tiered HTAP Databases with Pareto-Optimal Data Placements. IEEE 34th International Conference on Data Engineering (ICDE 2018). pp. 209-220 (2018).
  • Boissier, M., Meyer, C., Djürken, T., Lindemann, J., Mao, K., Reinhardt, P., Specht, T., Zimmermann, T., Uflacker, M.: Analyzing Data Relevance and Access Patterns of Live Production Database Systems. Proceedings of the 25th ACM International on Conference on Information and Knowledge Management, CIKM 2016. pp. 2473-2475. ACM, New York, NY, USA (2016).
  • Meyer, C., Boissier, M., Michaud, A., Vollmer, J.O., Taylor, K., Schwalb, D., Uflacker, M., Roedszus, K.: Dynamic and Transparent Data Tiering for In-Memory Databases in Mixed Workload Environments. International Workshop on Accelerating Data Management Systems Using Modern Processor and Storage Architectures - ADMS @ VLDB 2015 (2015).
  • Boissier, M.: Optimizing Main Memory Utilization of Columnar In-Memory Databases Using Data Eviction. Proceedings of PhD Workshop @ VLDB 2014, Hangzhou (2014).

Teaching

Lectures and Seminars:

Supervised Master Theses:

  • "Automatic Clustering in Hyrise" (ongoing)
  • "Learned Cost Models for Query Optimization" (finished in March 2019)
  • "Improving Cardinality Estimation and Access Avoidance in Hyrise" (finished in November 2018)
  • "Data-Driven Ordering and Dynamic Pricing Competition on Online Marketplaces" (finished in May 2018)
  • "Probabilistic Data Structures for In-Memory Databases" (finished in May 2018)
  • "Maintainable and Self-Adapting Column Compression Schemes for HTAP Databases" (finished in April 2018)
  • "Optimizing Database Scan Performance through Access Avoidance in Chunk-Based Databases using Multi-Dimensional Filters" (finished in August 2017)
  • "Predicting movie success before release – Using individualized econometric models to predict box office performance" (finished in January 2017)
  • "Workload-Aware Partitioning and Query Pruning for Mixed Workloads on In-Memory Databases" (finished in January 2016)