CIKM 18

Beacon in the Dark: A System for Interactive Exploration of Large Email Corpora

Abstract

Emails play a major role in today's business communication, documenting not only work but also decision-making processes. The large amount of heterogeneous data in these email corpora renders manual investigation by experts infeasible. Auditors or journalists, for example, who are looking for irregular or inappropriate content or suspicious patterns, need computer-aided exploration tools to support their investigations. We present our Beacon system for the exploration of such corpora at different levels of detail. A distributed processing pipeline combines text mining methods and social network analysis to augment the already semi-structured nature of emails. The user interface ties into the resulting cleaned and enriched dataset. For the interface design, we identify three objectives of expert users: gaining an initial overview of the data to identify leads to investigate, understanding the context of the information at hand, and having meaningful filters to iteratively focus on a subset of emails. To this end, we use interactive visualisations that rearrange and aggregate the extracted information to reveal salient patterns.

Demo Paper
