Rapid advances in location-acquisition technologies have led to large amounts of trajectory data. These data are the foundation for a broad spectrum of services driven and improved by trajectory data mining (e.g., ride-sharing, soccer analytics). However, for hybrid transactional and analytical workloads, storing and processing rapidly accumulating trajectory data is a non-trivial task. Because of the high data volume and velocity, both performance and memory consumption have to be addressed.
Based on the performance requirements of various applications that analyze trajectory data, we develop approaches to optimize columnar relational in-memory database systems for storing and processing spatio-temporal data. The relational database structure is well suited to storing trajectory data in the common sample-point format, enables efficient combination with other data sources (e.g., business data), and allows us to leverage recent developments in query processing and compression techniques for columnar database systems.
In our research, we focus on optimized data structures and selection mechanisms for workload-aware compression schemes to minimize the data footprint of trajectory data. To address access patterns that change over time, we introduce a framework that divides the spatio-temporal data into partitions of fixed size and applies specific compression techniques to each partition based on its data characteristics and access patterns. Additionally, we analyze various real-world use cases to evaluate the proposed optimizations.
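
The partition-wise compression idea described above can be illustrated with a minimal Python sketch. It is not the framework itself: the partition size, the `access_count` statistic, the `hot_threshold` parameter, the three encoding names, and the selection heuristic in `choose_encoding` are all hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence

PARTITION_SIZE = 4  # hypothetical fixed number of sample points per partition


@dataclass
class Partition:
    values: List[int]      # values of one column, e.g. timestamps
    access_count: int = 0  # hypothetical workload statistic for this partition


def split_into_partitions(column: Sequence[int],
                          size: int = PARTITION_SIZE) -> List[Partition]:
    """Cut a column into consecutive partitions of at most `size` values."""
    return [Partition(list(column[i:i + size]))
            for i in range(0, len(column), size)]


def choose_encoding(p: Partition, hot_threshold: int = 100) -> str:
    """Toy heuristic (an assumption, not the selection mechanism of the framework)."""
    if p.access_count >= hot_threshold:
        return "uncompressed"  # frequently accessed: favour access speed
    if len(set(p.values)) <= len(p.values) // 2:
        return "dictionary"    # few distinct values: dictionary encoding pays off
    return "delta"             # otherwise store differences between neighbours


def delta_encode(values: Sequence[int]) -> List[int]:
    """Keep the first value, then the difference to each predecessor."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas: Sequence[int]) -> List[int]:
    """Invert delta_encode with a running sum."""
    out, total = [], 0
    for d in deltas:
        total += d
        out.append(total)
    return out


timestamps = [100, 101, 102, 103, 200, 200, 200, 200, 300, 305, 310]
parts = split_into_partitions(timestamps)
parts[0].access_count = 500  # pretend the first partition is queried often
encodings = [choose_encoding(p) for p in parts]
```

In this toy run, the frequently accessed first partition stays uncompressed, the second partition with a single repeated value is dictionary-encoded, and the third is delta-encoded. Because the encoding is chosen per partition, it can be re-evaluated for a single partition when its access statistics change, without touching the rest of the column.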