Prof. Dr. Felix Naumann

Toni Gruetze

former member

Contact Information

Prof.-Dr.-Helmert-Straße 2-3
D-14482 Potsdam
Room: G-3.2.09

Phone: +49 331 5509 237

Research Interests

  • Web Mining
  • Distributed Computing
  • Information Retrieval
  • Machine Learning
  • Recommender Systems


  • Master's theses:
    • "Large-Scale Twitter Hashtag Recommendation for Documents" by Gary Yao, 2014
    • "Context-based Tweet Recommendation for News Articles" by Alexander Spivak, 2016
    • "Large-scale topic-based analysis of political discussions on Twitter" by Jaqueline Pollak, 2017


CohEEL: Coherent and Efficient Named Entity Linking through Random Walks

Gruetze, Toni; Kasneci, Gjergji; Zuo, Zhe; Naumann, Felix. In: Web Semantics: Science, Services and Agents on the World Wide Web, 2016.

In recent years, the ever-growing number of documents on the Web and in digital libraries has led to a considerable increase in valuable textual information about entities. Harvesting entity knowledge from these large text collections is a major challenge. It requires linking textual mentions within the documents to their real-world entities, a process called entity linking. Solutions to the entity linking problem have typically aimed at balancing the rate of correct links (precision) against the linking coverage rate (recall). Although entity links in texts can be used to improve various Information Retrieval tasks, such as text summarization, document classification, or topic-based clustering, linking precision is the decisive factor for these tasks. For example, for topic-based clustering, a method that produces mostly correct links is more desirable than a high-coverage method that leads to more, but also more uncertain, clusters. We propose an efficient linking method that uses a random walk strategy to combine a precision-oriented and a recall-oriented classifier in such a way that high precision is maintained while recall is raised to the maximum level possible without affecting precision. An evaluation on three datasets with distinct characteristics demonstrates that our approach outperforms seminal work in the area and shows higher precision and better time performance than the most closely related state-of-the-art methods.
Tags: Entity_Linking, Machine_Learning, Named_Entity_Disambiguation, Random_Walk, isg