In-Memory Data Management Research: Data Structures and Algorithms for In-Memory Databases
- Type of course: Master Seminar, Summer Term 2014
- Teaching Staff: Dr. Matthias Uflacker, Martin Faust, David Schwalb
- Location: Villa , 2.16, Hasso-Plattner-High-Tech-Park, August-Bebel-Str. 88 (tbc)
- Time: The seminar will mostly be held in individual meetings. Only the following obligatory meetings are held in the seminar's assigned slot (Tuesdays and Thursdays, 11h00-12h30 (s.t.) ):
- 08.04.2014 - 11:00: Introduction and Topics
- 10.04.2014 - 11:00: Main Memory DB Introduction
- 20.05.2014: Intro Scientific Writing (expected)
- 22./27.05.2014: Mid-term presentation (expected)
- 8./10.07.2014: Final presentation (expected)
- 26.07.2014: Paper hand-in (expected)
- 4 semester periods per week (Semesterwochenstunden)
- 6 credit points (graded)
- Area of specialization: BPET, OSIS, SAMT
The goal of the research seminar is to teach the students the basics of scientific research and a basic knowledge of the inner mechanics of in-memory databases. The seminar is focused on implementation concepts for columnar in-memory database systems on modern hardware. Hence, each student will work individually on a topic, resulting in a final paper (6-8 pages, IEEE) in addition to a midterm and final presentation. The topics vary from basic data structure concepts to in-memory optimised algorithms. Each topic will have an implementation component, which should be implemented and closely evaluated in the resulting paper.
All seminar topics will be implemented using the prototypical main memory database engine Hyrise. Hyrise started as a small prototype with limited functionality, but is extended more and more to provide features of full database management systems. It supports all functionality needed to execute TPC-C and TPC-H including traditional mechanisms for persistency using logging and experimental integration of non-volatile memory. The TPC workloads will be used to evalute the implemented work during the seminar. Hyrise is also available as an open source project on Github. Primary Key/ Unique Indices. Hyrise currently does not enforce Primary Key/Unique Constraints. In this work the student implements different indexing structures that efficiently allow to check uniqueness constraints. Possible implementations are the resorting of the table (Clustered Primary Key) or additional tree, trie or hash structures. Trie Index/ Radix Tree. A Radix Tree is a space efficient index structure. The goal of this work is to integrate such a data structure into Hyrise and compare the storage space and performance to alternatives. To achieve maximum performance the student should implement a lock free radix tree structure. Compound Keys. Hyrise currently has basic support for compound indices. In this topic, you will improve the current merge implementation to support the efficient merge of multi-column indices. The performance of the merge will be determined both theoretically and through benchmarks, and parameters that impact the index merge are qualified. Part of tuning the index implementation is to measure its performance. FusionIO Auto Commit Memory. FusionIO Auto Commit Memory (ACM) allows to map files from Fusion ioDrives into memory, modify them on a byte level and to automatically sync changes to persistent storage. In this work, you integrate ACM into Hyrise to achieve durability of the database log. Also, in the next step, you will research if an in-memory database can leverage this functionality to achieve durability by keeping its primary data structure on ACM and what the performance implications are. Memory Mapped File Checkpointing. A checkpoint is a consistent snapshot of the database, speeding up recovery in case of failures as not the complete log has to be replayed. In-memory databases with Main/Delta concept need to persist the complete delta for a checkpoint which is a costly operation as it needs to serialize large data structures. In this work, you extend the Checkpointing Algorithm in Hyrise by allocating delta data structures on Memory Mapped files on a Fusion ioDrive and perform a msync of the file for the checkpoint. Optimized Logging for Flash. Traditional ARIES style logging with group commits is optimized for shortcomings of disks, however, modern flash based storage devices have completely different characteristics. Here, the task is to optimize the logging algorithm in Hyrise for flash storage drives like Fusion ioDrive or SSDs and analyse the performance of different implementations. Database Cockpit. The goal of this topic is to develop an applications that shows database performance characteristics and database statistics. The demo app should visualize heartbeats from Hyrise during the execution of a TPC-C workload, showing multiple live charts visualizing query performance and database statistics. Control elements to influence workload are provided by the app, as well as the ability to simulate a crash in the database and show instant recovery features. The app should be implemented with HTML5. Benchmark Framework. The currently available Python-based benchmark framework allows the execution of a TPC-C workload on Hyrise. The tool includes the Apache Benchmark Tool (ab). In this work, the benchmark tool is improved to support multiple workloads and modular experiments (e.g. CH-Benchmark,TPC-CH, TATP). Multi-Event-Loop. Extend the current implementation using one ev-loop so that one loop is instantiated per NUMA-node, as dispatching short running queries to remote memory is costly. As one loop listens on one private HTTP port, this will also require some mechanisms to redirect incoming connections to the respective ev-loops in order to provide one single port for incoming connections. Alternatively, instead of using HTTP, a custom protocol might be developed. Shared Domain Dictionary for improved Join and Update Performance Typically the value-range of key columns is naturally incremental and thus an order-preserving dictionary (having a fixed encoding) can be assumed for in-memory databases. This simplifies update and inserts operations into these columns, as a re-encoding of the data vectors is never required. If such an dictionary is shared among the primary and all foreign key columns of one domain we can improved update as well as join performance for a huge and important group of columns in IMDB systems. Within this master seminar a shared domain dictionary (SDD), as well as an adapted update and join operations should be implemented and evaluated in HYRISE. Tiering Indices based on Workload Statistics for Hot and Cold Data Access. Classifying data based on their relevance is important for in-memory databases to improve memory utilization, system performance, and the handling of increasing workloads and data volumes. Traditional concepts as page-based buffer management for frequently used pages have been dropped for performance reasons in today's IMDBs. Instead, analyzing the actual workload of a database allows to capture and classify the relevance of data. In order to exploit data classification and improve performance, we need to avoid accesses to cold data partitions as much as possible and ensure that queries on hot partitions only still yield correct results. In this master seminar different tiering indices for hot and cold data classification and access should be elaborated and implemented in HYRISE.
The following components determine the final mark:
| Part||Valuation in % ||Type|
| Presentations (Mid-term / Final) || 30 (10 / 20) || Personal grade |
| Results || 30 || Personal grade |
| Article || 30 || Personal grade |
| General participation in the seminar || 10 || Personal grade |
All of the components must be passed in order to pass the seminar.