Motivation
Earlier bachelor's projects focused on how an in-memory data layer can change the way “traditional” transactional enterprise applications work. By eliminating redundantly stored aggregates, views, and indices, the applications became 10 to 100 times faster, while their functionality was extended and their flexibility improved. With steadily growing data volumes, however, keeping a company's complete data in main memory becomes increasingly expensive. Analyses of enterprise workloads show that deprecated transactional data (e.g., resolved issues in a ticket system or paid invoices) is accessed very rarely. This raises the question of how such deprecated data (so-called cold data) can be recognized and moved to less expensive storage (e.g., solid-state disks). While storing cold data on secondary storage lowers the system's memory requirements, it also introduces new challenges with respect to query performance on cold data, which have to be addressed.
Goal
This bachelor's project focuses on ways to recognize data that is rarely accessed in an enterprise's daily workload and can thus be moved to secondary storage. Furthermore, the project evaluates different approaches to storing and accessing cold data so that query performance is retained on both hot and cold data.