Using Non-Volatile Memory to Extend the Capacity of In-Memory Databases
Existing byte-addressable non-volatile memory (NVM) technologies have four times the capacity of DRAM DIMMs, allowing in-memory databases (IMDBs) to scale beyond their current capacity limitations. At the same time, NVM has higher latency and lower bandwidth than DRAM, so a balance between performance and capacity needs to be found. The overarching question of this work is how IMDBs can store most of their data on NVM while keeping the performance sacrifice to a minimum. As part of this, we make three contributions:
First, we present the entirely rewritten research IMDB Hyrise. To be able to measure the performance impact of DRAM/NVM tiering, we built a system that allows us to gather statistics relevant for NVM placement decisions, transparently move data from DRAM to NVM and back, and evaluate the placement end-to-end. Hyrise not only serves as the experimental platform in this thesis, but is also used for other research in our group, by industry partners for internal evaluations, and in our Master's courses.
Second, we evaluate data structures found in IMDBs (e.g., uncompressed data, dictionaries and attribute vectors, secondary indexes) to determine how storing them on NVM affects their performance. We show that moving rarely accessed and sequentially loaded data to NVM has only a small performance impact. In contrast to the focus of most research on NVM data structures, we show that these data structures are often written only once, which allows us to relax some of the consistency requirements imposed on general-purpose data structures.
Third, we present an automatic and transparent NVM placement mechanism and evaluate it using synthetic and real-world workloads. We evaluate how many of the system's memory slots can be filled with NVM instead of DRAM without a significant impact on the latency and throughput of the database. Looking beyond the placement of existing data, we show how the added capacity also allows us to create additional secondary data structures that were not cost-effective on DRAM but become feasible thanks to NVM.