In its vanilla version, Hyrise2 is a fully DRAM-resident database. With ever-growing data volumes being processed by modern applications, however, the DRAM capacity of single servers (and even large clusters) is easily exceeded, not to mention the hardware cost of storing all data in comparatively expensive DRAM. The Footprint Reduction focus area therefore studies topics that aim to reduce the DRAM space used by Hyrise2.
While storage tiers besides DRAM are less expensive, they also introduce performance penalties. Consequently, the question is to what extent we can lower the DRAM footprint without losing the performance superiority of DRAM-resident databases, and how we can balance this trade-off.
Our research focuses on two areas that we work on separately in order to limit dependencies and the database's complexity:
- Compression: this area covers the challenge of reducing the DRAM space required for data structures, e.g., via lossless compression or approximate index structures.
- Tiering: the goal of data tiering is to (i) classify data as 'tierable' (usually infrequently accessed data) and (ii) find means to efficiently evict this data to secondary storage layers. Furthermore, we study means to avoid unnecessary accesses to secondary storage.
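To make the compression area concrete, the following is a minimal sketch of dictionary encoding, a common lossless scheme for columnar data; real systems would additionally bit-pack the value IDs, and the function names here are illustrative, not Hyrise2's API:

```python
def dictionary_encode(values):
    """Encode a column as (sorted dictionary, attribute vector of value IDs).

    Repetitive columns shrink because each duplicate value is stored
    once in the dictionary and referenced by a small integer ID.
    """
    dictionary = sorted(set(values))
    id_of = {v: i for i, v in enumerate(dictionary)}
    attribute_vector = [id_of[v] for v in values]
    return dictionary, attribute_vector

# Example: 8 strings reduce to a 3-entry dictionary plus integer IDs.
column = ["red", "blue", "red", "red", "green", "blue", "red", "red"]
dictionary, ids = dictionary_encode(column)
```

Decoding is a simple lookup (`dictionary[id]`), so scans can often operate directly on the integer IDs without materializing the strings.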
We are currently working on several projects that aim to improve the footprint. Among them are a workload-driven and adaptable column compression scheme, approximate index structures for faster access paths and improved cardinality estimation, and filters to improve partition pruning for tiered data.
New Memory Hardware
Memory is the new bottleneck of modern databases. In the time it takes to retrieve a value from a single DRAM address, hundreds of non-memory operations can be executed. As such, it becomes increasingly important to make efficient use of the available memory bandwidth. For large NUMA setups, where latencies are higher and bandwidths lower, we are exploring ways to make direct use of hardware primitives that improve access performance.
Looking at upcoming memory technologies, such as Non-Volatile Memory (NVM), we are implementing concepts that incorporate these into the memory management of the database. We have identified two research areas: First, a database that uses NVM as its primary storage can guarantee crash resilience even without a separate log component. When persistence becomes a first-class citizen, we can rethink the way transaction handling and concurrency work. Furthermore, native persistence without the need to replay a log greatly reduces the time needed to restart and recover the database.
Second, we are looking into how NVM and DRAM can be used side by side. In a system that has both types of memory, data has to be placed dynamically so that the advantages and disadvantages of both types are carefully balanced. Some data structures, such as temporary hash tables created by joins, are a natural fit for fast, volatile DRAM. Others, such as persistent attribute vectors that are mostly read sequentially, should be placed on NVM instead.
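The placement rationale above can be sketched as a toy heuristic. The attributes and names below are illustrative assumptions, not Hyrise2's actual bookkeeping:

```python
from dataclasses import dataclass

@dataclass
class DataStructure:
    name: str
    persistent: bool        # must the structure survive a restart?
    sequential_reads: bool  # is it mostly scanned sequentially?

def place(ds):
    """Toy DRAM/NVM placement rule: persistent, sequentially scanned
    structures tolerate NVM's higher latency; volatile, latency-
    sensitive structures stay in DRAM."""
    if ds.persistent and ds.sequential_reads:
        return "NVM"
    return "DRAM"

# Temporary join hash tables stay in DRAM; persistent, sequentially
# read attribute vectors go to NVM.
assert place(DataStructure("join_hash_table", False, False)) == "DRAM"
assert place(DataStructure("attribute_vector", True, True)) == "NVM"
```

A real placement component would additionally weigh access frequencies and the relative bandwidth of both memory types rather than two boolean flags.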
Increasing demand for analytical processing capabilities can be met by scale-out approaches. In replication schemes, replica nodes process read-only queries on snapshots of the master node without violating transactional consistency. By analyzing workloads, we can identify query access patterns and replicate data depending on its access frequency. In this way, we can reduce the replication cluster's memory consumption and process the query load at lower cost.
In our research, we investigate approaches to find good replication configurations that can be deployed to scale with the workload and reduce the overall storage footprint. Furthermore, we compare synchronization methods that keep the replicas' data up to date with respect to transactional changes.
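One way to frame such a replication configuration is a greedy assignment: route each query class to the least-loaded replica, so that a replica only needs copies of the tables its queries touch. The following sketch uses hypothetical query classes and a deliberately simple cost model, not Hyrise's actual allocator:

```python
def assign_queries(query_loads, query_tables, n_replicas):
    """Greedy workload-driven partial replication: heavy query classes
    are placed first; each goes to the currently least-loaded replica,
    which then must hold that query's tables."""
    loads = [0.0] * n_replicas
    tables = [set() for _ in range(n_replicas)]
    for q, load in sorted(query_loads.items(), key=lambda kv: -kv[1]):
        r = min(range(n_replicas), key=lambda i: loads[i])
        loads[r] += load
        tables[r] |= set(query_tables[q])
    return loads, tables

# Hypothetical workload: three query classes over three tables.
query_loads = {"q1": 5.0, "q2": 3.0, "q3": 2.0}
query_tables = {"q1": ["orders"], "q2": ["lineitem"], "q3": ["orders", "customer"]}
loads, tables = assign_queries(query_loads, query_tables, n_replicas=2)
```

In this example both replicas end up with a balanced load, and neither replica has to hold all three tables, which is exactly the footprint saving over full replication.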
The settings and physical layout of database systems that handle large enterprise workloads need to be configured in order to deliver optimal performance, comply with SLAs, and utilize the underlying hardware's capabilities to their full extent. These configurations are usually performed manually by database administrators (DBAs). Several aspects make manual tuning and configuration complicated, time-consuming, and therefore expensive.
In realistic databases, there is a large number of options to choose from: databases can contain hundreds or thousands of tables. Often, these options are interdependent, meaning that they influence each other. In addition, no single configuration performs best for all workloads, so every database instance has to be configured individually. Moreover, workloads may change over time, which makes reconfigurations necessary.
Therefore, we define a set of cost functions and formulate the configuration questions as optimization problems. Using data from the database's internal statistics component, the query plan cache, and parts of the optimizer, we investigate how to apply heuristics and solvers that identify optimized solutions for these questions. Ultimately, these insights can support database administrators or lead to autonomous databases that configure and tune themselves.
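As an example of such an optimization problem, consider index selection under a memory budget. The sketch below uses a benefit-per-byte greedy heuristic for the underlying knapsack problem; the candidate tuples (name, estimated benefit, size) are hypothetical inputs, e.g., derived from the plan cache:

```python
def select_indexes(candidates, budget):
    """Pick index candidates maximizing estimated benefit (e.g., saved
    scan cost) under a memory budget, greedily by benefit per byte."""
    chosen, used = [], 0
    for name, benefit, size in sorted(candidates,
                                      key=lambda c: c[1] / c[2],
                                      reverse=True):
        if used + size <= budget:
            chosen.append(name)
            used += size
    return chosen

# Hypothetical candidates: (name, benefit estimate, size in MB).
candidates = [("idx_orders_date", 90.0, 30),
              ("idx_customer_id", 50.0, 10),
              ("idx_lineitem_pk", 40.0, 25)]
chosen = select_indexes(candidates, budget=40)
```

Greedy selection is not optimal in general (interdependent indexes violate its assumptions, as noted above), which is why we also study exact solvers for these formulations.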
Graphics Processing Units (GPU) are well suited to processing large amounts of data in parallel. With their large number of processing cores and multiple arithmetic logic units (ALUs), GPUs deliver high computational power and better energy efficiency compared to CPUs. In addition, modern GPUs incorporate high bandwidth memory (HBM), which provides a memory bandwidth that exceeds that of DRAM.
We have identified the following key challenges when it comes to incorporating GPUs into in-memory database management systems for better performance:
- Limited local memory: The available HBM on a GPU is still relatively small (16 GB on the NVIDIA P100)
- Transfer costs: Data residing in DRAM must be transferred to the GPU’s HBM, at a cost that can nullify the benefits of parallel processing on the GPU
- Workload selection: GPUs perform well on large numbers of parallel arithmetic operations. For sequential processing and operations beyond arithmetic, the device cannot reach its peak performance and falls behind CPUs.
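The transfer-cost challenge can be quantified with a back-of-the-envelope break-even check. The function and the assumed effective link bandwidth of 16 GB/s (roughly PCIe 3.0 x16) are illustrative:

```python
def gpu_worthwhile(bytes_to_transfer, cpu_time_s, gpu_time_s, link_gbs=16.0):
    """Offloading pays off only if GPU compute time plus transfer time
    over the CPU-GPU link beats pure CPU execution.

    link_gbs: assumed effective link bandwidth in GB/s (illustrative).
    """
    transfer_s = bytes_to_transfer / (link_gbs * 1e9)
    return gpu_time_s + transfer_s < cpu_time_s

# Moving 8 GB over a 16 GB/s link costs 0.5 s of pure transfer time.
# A 10x GPU speedup pays off for a 1 s CPU job (0.1 + 0.5 < 1.0),
# but not for a 0.4 s CPU job (0.04 + 0.5 > 0.4).
```

This is why a faster interconnect such as NVLink, discussed below, shifts the break-even point toward smaller workloads.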
Upcoming GPU generations allow us to better cope with these challenges. For example, Nvidia provides a high-bandwidth interconnect between CPUs and GPUs called NVLink. NVLink relaxes the bandwidth limitation between DRAM and on-chip HBM and is also used to create clusters of multiple GPUs operating together. This allows for systems with up to 128 GB of combined on-chip GPU memory (Nvidia DGX-1).
For Hyrise, we envision and plan to evaluate a setup in which the compressed chunks of every table are moved entirely to GPU memory. All inserts are still handled by the CPU and stored in uncompressed chunks in DRAM. Once a modifiable chunk reaches its maximum capacity, it is compressed and moved to GPU memory. This is a one-time transfer, as data residing in compressed chunks is not subject to updates (insert-only).
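The envisioned chunk lifecycle can be sketched as follows; class and method names are illustrative, not Hyrise's actual API, and "compression" here is a stand-in for the real encoding step:

```python
class Chunk:
    """Rows are inserted into a modifiable DRAM chunk; once the chunk
    is full, it is sealed (compressed) and shipped to GPU memory in a
    single one-time transfer. Sealed chunks are immutable."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.rows = []
        self.location = "DRAM"
        self.compressed = False

    def insert(self, row):
        assert not self.compressed, "compressed chunks are immutable (insert-only)"
        self.rows.append(row)
        if len(self.rows) == self.capacity:
            self._seal_and_offload()

    def _seal_and_offload(self):
        # Stand-in for dictionary compression plus the DRAM-to-HBM copy.
        self.compressed = True
        self.location = "GPU_HBM"

chunk = Chunk(capacity=3)
for row in [(1, "a"), (2, "b"), (3, "c")]:
    chunk.insert(row)
```

Because sealed chunks never change, the transfer cost discussed above is amortized over every subsequent GPU-side scan of that chunk.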