Prof. Dr. Felix Naumann

Julian Risch

I am a Ph.D. student at the Information Systems Group and a member of the HPI Research School. My research focuses on topic modeling and deep learning with applications in the field of text mining, in particular, comment analysis. Further, I am involved in projects on patent classification and book recommendation.

Source code for my publications can be found here and on GitHub.

Contact Information

Prof.-Dr.-Helmert-Straße 2-3
D-14482 Potsdam
Room: F-2.08

Phone: +49 331 5509 272

Email: Julian Risch

Open Master's Theses

I provide supervision for Master's theses in the area of News Comment Analysis, e.g., Toxic Comment Classification, User Engagement Prediction, Comment Recommendation, and Discussion Summarization/Visualization. Feel free to schedule an informal meeting with me to discuss details of these topics and/or your own ideas.


Advised Master's Theses

  • Enriching Document Embeddings With Domain Knowledge
  • Modeling News Commenters for Discussion Recommendation
  • Jointly Learning Document and Label Embeddings for Hierarchically Labeled Text
  • Context-aware Classification of News Comments
  • Quality Management for Online News Comments 


Learning Patent Speak: Investigating Domain-Specific Word Embeddings

Risch, Julian; Krestel, Ralf. In Proceedings of the Thirteenth International Conference on Digital Information Management (ICDIM), pages 63-68. 2018.

A patent examiner needs domain-specific knowledge to classify a patent application according to its field of invention. Standardized classification schemes help to compare a patent application to previously granted patents and thereby check its novelty. Due to the large volume of patents, automatic patent classification would be highly beneficial to patent offices and other stakeholders in the patent domain. However, a challenge for the automation of this costly manual task is the patent-specific language use. To facilitate this task, we present domain-specific pre-trained word embeddings for the patent domain. We trained our model on a very large dataset of more than 5 million patents to learn the language use in this domain. We evaluated the quality of the resulting embeddings in the context of patent classification. To this end, we propose a deep learning approach based on gated recurrent units for automatic patent classification built on the trained word embeddings. Experiments on a standardized evaluation dataset show that our approach increases average precision for patent classification by 17 percent compared to state-of-the-art approaches.
Tags: deep_learning, isg, myown, patent_classification, web_science