Hasso-Plattner-InstitutSDG am HPI
Hasso-Plattner-InstitutDSG am HPI

Aragats Amirkhanyan

Methods and Frameworks for GeoSpatioTemporal Data Analytics

In the era of social networks, internet of things and location-based services, many online services produce a huge amount of data that have valuable objective information, such as geographic coordinates and date time. These characteristics (parameters) in the combination with a textual parameter bring the challenge for the discovery of geospatiotemporal knowledge. This challenge requires efficient methods for clustering and pattern mining in spatial, temporal and textual spaces. In this thesis, we address the challenge of providing methods and frameworks for geospatiotemporal data analytics. As an initial step, we address the challenges of geospatial data processing: data gathering, normalization, geolocation, and storage. That initial step is the basement to tackle the next challenge — geospatial clustering challenge. The first step of this challenge is to design the method for online clustering of georeferenced data. This algorithm can be used as a server-side clustering algorithm for online maps that visualize massive georeferenced data. As the second step, we develop the extension of this method that considers, additionally, the temporal aspect of data. For that, we propose the density and intensity-based geospatiotemporal clustering algorithm with fixed distance and time radius. Each version of the clustering algorithm has its own use case that we show in the thesis. In the next chapter of the thesis, we look at the spatiotemporal analytics from the perspective of the sequential rule mining challenge. We design and implement the framework that transfers data into textual geospatiotemporal data - data that contain geographic coordinates, time and textual parameters. By this way, we address the challenge of applying pattern/rule mining algorithms in geospatiotemporal space. As the applicable use case study, we propose spatiotemporal crime analytics — discovery spatiotemporal patterns of crimes in publicly available crime data. The second part of the thesis, we dedicate to the application part and use case studies. We design and implement the application that uses the proposed clustering algorithms to discover knowledge in data. Jointly with the application, we propose the use case studies for analysis of georeferenced data in terms of situational and public safety awareness.