Kossmann, J., Boissier, M., Dubrawski, A., Heseding, F., Mandel, C., Pigorsch, U., Schneider, M., Schniese, T., Sobhani, M., Tsayun, P., Wille, K., Perscheid, M., Uflacker, M., Plattner, H.: A Cockpit for the Development and Evaluation of Autonomous Database Systems. IEEE 37th International Conference on Data Engineering (ICDE 2021), to appear (2021).
Richly, K., Schlosser, R., Boissier, M.: Joint Index, Sorting, and Compression Optimization for Memory-Efficient Spatio-Temporal Data Management. IEEE 37th International Conference on Data Engineering (ICDE 2021), to appear (2021).
The wide distribution of location-acquisition technologies has led to large volumes of spatio-temporal data, which are the foundation for a broad spectrum of applications. Based on these applications' performance requirements, in-memory databases are used to store and process the data. As DRAM capacities are limited and expensive, modern database systems apply various configuration optimizations (e.g., compression) to reduce the memory footprint. Selecting configurations that balance cost and performance is challenging due to the vast number of possible setups consisting of mutually dependent individual decisions. In this paper, we present a linear programming approach to determine fine-grained configuration decisions for spatio-temporal workloads. By dividing the data into partitions of fixed size, we can apply the compression, sorting, and index selections on a fine-grained level to reflect spatio-temporal access patterns. Our approach jointly optimizes these configurations to maximize performance under a given memory budget. We demonstrate on a real-world dataset that models specifically optimized for spatio-temporal data characteristics allow us to reduce the memory footprint (by up to 60% at equal performance) and increase the performance (by up to 80% at equal memory size) compared to established rule-based heuristics.
Dreseler, M., Boissier, M., Rabl, T., Uflacker, M.: Quantifying TPC-H Choke Points and Their Optimizations. Proceedings of the VLDB Endowment. pp. 1206-1220 (2020).
Boissier, M., Jendruk, M.: Workload-Driven and Robust Selection of Compression Schemes for Column Stores. 22nd International Conference on Extending Database Technology (EDBT). pp. 674-677 (2019).
Schlosser, R., Kossmann, J., Boissier, M.: Efficient Scalable Multi-Attribute Index Selection Using Recursive Strategies. IEEE 35th International Conference on Data Engineering (ICDE 2019). pp. 1238-1249. IEEE (2019).
Dreseler, M., Kossmann, J., Boissier, M., Klauck, S., Uflacker, M., Plattner, H.: Hyrise Re-engineered: An Extensible Database System for Research in Relational In-Memory Data Management. 22nd International Conference on Extending Database Technology (EDBT). pp. 313-324 (2019).
Boissier, M.: Reducing the Footprint of Main Memory HTAP Systems: Removing, Compressing, Tiering, and Ignoring Data. PhD Workshop at VLDB. CEUR-WS.org (2018).
Gracefully reducing the main memory footprint (e.g., via compression and data tiering) is an unsolved challenge for HTAP database systems, where most traditional reduction methods are no longer applicable. Since the advantages of a reduced footprint are manifold, the issue is of high importance for main memory-resident databases. In this paper, we present our work on Hyrise and discuss how we break down this challenge into three aspects in order to reduce the main memory consumption without losing the performance advantage of in-memory databases. First, we reduce existing allocations by efficiently selecting indices and workload-driven compression configurations for table data. Second, we use hybrid table layouts to tier data with minimal impact on the runtime performance. Third, we employ data structures that efficiently eliminate unnecessary accesses to secondary storage. As an outlook, we depict our vision to eventually unite these aspects in a holistic footprint optimization.
Schlosser, R., Boissier, M.: Dealing with the Dimensionality Curse in Dynamic Pricing Competition: Using Frequent Repricing to Compensate Imperfect Market Anticipations. Computers and Operations Research. 100, 26-42 (2018).
Most sales applications are characterized by competition and limited demand information. For successful pricing strategies, frequent price adjustments as well as anticipation of market dynamics are crucial. Both effects are challenging as competitive markets are complex and computations of optimized pricing adjustments can be time-consuming. We analyze stochastic dynamic pricing models under oligopoly competition for the sale of perishable goods. To circumvent the curse of dimensionality, we propose a heuristic approach to efficiently compute price adjustments. To demonstrate our strategy’s applicability even if the number of competitors is large and their strategies are unknown, we consider different competitive settings in which competitors frequently and strategically adjust their prices. For all settings, we verify that our heuristic strategy yields promising results. We compare the performance of our heuristic against upper bounds, which are obtained by optimal strategies that take advantage of perfect price anticipations. We find that price adjustment frequencies can have a larger impact on expected profits than price anticipations. Finally, our approach has been applied on Amazon for the sale of used books. We have used a seller’s historical market data to calibrate our model. Sales results show that our data-driven strategy outperforms the rule-based strategy of an experienced seller by a profit increase of more than 20%.
Ruhrländer, R.P., Boissier, M., Uflacker, M.: Improving Box Office Result Predictions for Movies Using Consumer-Centric Models. KDD '18 Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. pp. 655-664 (2018).
Recent progress in machine learning and related fields like recommender systems opens up new possibilities for data-driven approaches. One example is the prediction of a movie’s box office revenue, which is highly relevant for optimizing production and marketing. We use individual recommendations and user-based forecast models in a system that forecasts revenue and additionally provides actionable insights for industry professionals. In contrast to most existing models that completely neglect user preferences, our approach allows us to model the most important source for movie success: moviegoer taste and behavior. We divide the problem into three distinct stages: (i) we use matrix factorization recommenders to model each user’s taste, (ii) we then predict the individual consumption behavior, and (iii) eventually aggregate users to predict the box office result. We compare our approach to the current industry standard and show that the inclusion of user rating data reduces the error by a factor of 2 and outperforms recently published research.
Schlosser, R., Boissier, M.: Dynamic Pricing under Competition on Online Marketplaces: A Data-Driven Approach. KDD '18 Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. pp. 705-714 (2018).
Most online markets are characterized by competitive settings and limited demand information. Due to the complexity of such markets, efficient pricing strategies are hard to derive. We analyze stochastic dynamic pricing models in competitive markets with multiple offer dimensions, such as price, quality, and rating. In a first step, we use a simulated test market to study how sales probabilities are affected by specific customer behaviors and the strategic interaction of price reaction strategies. Further, we show how different state-of-the-art learning techniques can be used to estimate sales probabilities from partially observable market data. In a second step, we use a dynamic programming model to compute an effective pricing strategy which circumvents the curse of dimensionality. We demonstrate that the strategy is applicable even if the number of competitors is large and their strategies are unknown. We show that our heuristic can be tuned to smoothly balance profitability and speed of sales. Further, our approach is currently applied by a large seller on Amazon for the sale of used books. Sales results show that our data-driven strategy outperforms the rule-based strategy of an experienced seller by a profit increase of more than 20%.
Boissier, M., Schlosser, R., Uflacker, M.: Hybrid Data Layouts for Tiered HTAP Databases with Pareto-Optimal Data Placements. IEEE 34th International Conference on Data Engineering (ICDE 2018). pp. 209-220 (2018).
Recent developments in database research introduced HTAP systems that are capable of handling both transactional and analytical workloads. These systems achieve their performance by storing the full data set in main memory. An open research question is how far one can reduce the main memory footprint without losing the performance superiority of main memory-resident databases. In this paper, we present a hybrid main memory-optimized database for mixed workloads that evicts cold data to less expensive storage tiers. It adapts the storage layout to mitigate the negative performance impact of secondary storage. A key challenge is to determine which data to place on which storage tier. We introduce a novel workload-driven model that determines Pareto-optimal allocations while also considering reallocation costs. We evaluate our concept for a production enterprise system as well as reproducible data sets.
Boissier, M., Meyer, C., Djürken, T., Lindemann, J., Mao, K., Reinhardt, P., Specht, T., Zimmermann, T., Uflacker, M.: Analyzing Data Relevance and Access Patterns of Live Production Database Systems. Proceedings of the 25th ACM International on Conference on Information and Knowledge Management, CIKM 2016. pp. 2473-2475. ACM, New York, NY, USA (2016).
Meyer, C., Boissier, M., Michaud, A., Vollmer, J.O., Taylor, K., Schwalb, D., Uflacker, M., Roedszus, K.: Dynamic and Transparent Data Tiering for In-Memory Databases in Mixed Workload Environments. International Workshop on Accelerating Data Management Systems Using Modern Processor and Storage Architectures - ADMS @ VLDB 2015 (2015).
Boissier, M.: Optimizing Main Memory Utilization of Columnar In-Memory Databases Using Data Eviction. Proceedings of PhD Workshop @ VLDB 2014, Hangzhou (2014).