Visualisations are meant to provide intuitive ways to explore large document collections. State-of-the-art approaches usually transform high-dimensional representations of documents into 2-dimensional vectors using dimensionality reduction algorithms. These vectors are then placed in a landscape that ideally retains the semantic similarity information of the high-dimensional representation. Traditionally, dimensionality reduction algorithms are developed with static collections in mind. However, many "real-world" document collections, such as news articles, scientific literature, patents, Wikipedia, or tweets, grow and evolve over time. Visualising the temporal change of these collections poses various challenges for out-of-the-box dimensionality reduction algorithms. In this paper, we propose strategies to adapt existing dimensionality reduction algorithms to incorporate change. These strategies ensure that landscapes at different intervals of the collection are robust with regard to spatio-temporal coherence. Furthermore, we propose metrics to measure the stability over time and compare several popular dimensionality reduction algorithms.
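To make the idea of "stability over time" concrete, here is a minimal sketch of one possible measure: the average distance that documents present in two consecutive landscapes move between them. This is an illustrative assumption, not necessarily one of the metrics proposed in the paper; the function name `mean_displacement` and the toy layouts are hypothetical.

```python
from math import hypot


def mean_displacement(prev_layout, next_layout):
    """Average Euclidean distance moved by documents present in both layouts.

    Each layout maps a document id to its (x, y) position in the landscape.
    Lower values mean the landscape changed less between the two intervals.
    """
    shared = prev_layout.keys() & next_layout.keys()
    if not shared:
        raise ValueError("layouts share no documents")
    total = 0.0
    for doc_id in shared:
        (x0, y0), (x1, y1) = prev_layout[doc_id], next_layout[doc_id]
        total += hypot(x1 - x0, y1 - y0)
    return total / len(shared)


# Two hypothetical 2-D landscapes at consecutive time intervals.
# doc_c is new in the second interval and is ignored by the measure.
layout_t0 = {"doc_a": (0.0, 0.0), "doc_b": (1.0, 1.0)}
layout_t1 = {"doc_a": (3.0, 4.0), "doc_b": (1.0, 1.0), "doc_c": (2.0, 2.0)}

print(mean_displacement(layout_t0, layout_t1))  # 2.5
```

A raw displacement like this depends on the scale and orientation of each landscape, so in practice the layouts would probably need to be normalised or aligned before being compared.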
Watch our new German-language MOOC about online hate and fake news ("Trolle, Hass und Fake-News: Wie können wir das Internet retten?") on openHPI.
Our work on Measuring and Comparing Dimensionality Reduction Algorithms for Robust Visualisation of Dynamic Text Collections will be presented at CHIIR 2021.
I added some photos from my trip to Hildesheim.