Hasso-Plattner-Institut
Prof. Dr. Felix Naumann
 

Project Description

Deep neural networks can be used to create representations for words, sentences, and documents, as well as for entities, relations, and more. These representations are dense vectors that encode high-dimensional, sparse data in a compact way. Such embedding models have been shown to improve the results of many text mining tasks. Further, combining these representations can reveal new insights. We investigate how these models can be used for text mining and develop new models for specific text mining tasks, such as splitting e-mail threads.

Subprojects

Project-Related Publications

  • 1.
    Loster, M., Mottin, D., Papotti, P., Naumann, F., Ehmueller, J., Feldmann, B.: Few-Shot Knowledge Validation using Rules. Proceedings of the Web Conference (2021).
     
  • 2.
    Risch, J., Krestel, R.: Domain-specific word embeddings for patent classification. Data Technologies and Applications. 53, 108–122 (2019).
     
  • 3.
    Risch, J., Krestel, R.: Learning Patent Speak: Investigating Domain-Specific Word Embeddings. Proceedings of the Thirteenth International Conference on Digital Information Management (ICDIM). pp. 63–68 (2018).
     
  • 4.
    Bunk, S., Krestel, R.: WELDA: Enhancing Topic Models by Incorporating Local Word Contexts. Joint Conference on Digital Libraries (JCDL 2018). ACM, Fort Worth, Texas, USA (2018).
     
  • 5.
    Repke, T., Krestel, R.: Topic-aware Network Visualisation to Explore Large Email Corpora. International Workshop on Big Data Visual Exploration and Analytics (BigVis). (2018).
     
  • 6.
    Repke, T., Krestel, R.: Bringing Back Structure to Free Text Email Conversations with Recurrent Neural Networks. 40th European Conference on Information Retrieval (ECIR 2018). Springer, Grenoble, France (2018).