Hasso-Plattner-Institut
Prof. Dr. Felix Naumann
  
 

Project Description

Today's business communication is almost unimaginable without emails. They document discussions and decisions or summarise face-to-face meetings in the form of unstructured text or attachments, and thus hold a significant amount of information about a business. In exceptional cases, for example when investigating a known case of fraud, specialists examine the inboxes and attached files of the personnel involved to determine the extent of the situation. However, the sheer quantity of documents is unmanageable without guidance from an exploration tool, as journalists working with the Panama Papers leak experienced.

In this project, we develop and evaluate information extraction and linking methods and combine them in an exploration tool. This work touches on the fields of text mining, text summarisation, document classification, topic modelling, named entity extraction, entity linking, relationship extraction, and social network and graph analysis. We work together with our industry partner from the financial sector to put our prototypes in the hands of auditors for real-world feedback.

Subprojects

Project-Related Publications

  • Repke, T., Krestel, R., Edding, J., Hartmann, M., Hering, J., Kipping, D., Schmidt, H., Scordialo, N., Zenner, A.: Beacon in the Dark: A System for Interactive Exploration of Large Email Corpora. Proceedings of the International Conference on Information and Knowledge Management (CIKM). pp. 1–4. ACM (2018).
     
  • Repke, T., Krestel, R.: Topic-aware Network Visualisation to Explore Large Email Corpora. International Workshop on Big Data Visual Exploration and Analytics (BigVis). CEUR-WS.org (2018).
     
  • Repke, T., Krestel, R.: Bringing Back Structure to Free Text Email Conversations with Recurrent Neural Networks. 40th European Conference on Information Retrieval (ECIR 2018). Springer, Grenoble, France (2018).
     
  • Zuo, Z., Loster, M., Krestel, R., Naumann, F.: Uncovering Business Relationships: Context-sensitive Relationship Extraction for Difficult Relationship Types. Proceedings of the Conference "Lernen, Wissen, Daten, Analysen" (LWDA) (2017).
     
  • Repke, T., Loster, M., Krestel, R.: Comparing Features for Ranking Relationships Between Financial Entities Based on Text. Proceedings of the 3rd International Workshop on Data Science for Macro-Modeling with Financial and Economic Datasets. pp. 12:1–12:2. ACM, New York, NY, USA (2017).