Natural Language Processing for Patent Retrieval
Granted patents form an extensive knowledge base for information retrieval, a research field of interest to both academia and industry. Domain-specific terminology in particular is challenging for state-of-the-art approaches. This master’s thesis therefore focuses on document representations that capture a patent’s topics and serve as the basis for a patent retrieval algorithm.
In this master’s thesis, you will jointly mine the topical and the spatial aspects of a dataset of 5 million patents in order to improve current retrieval models. For example, an inventor’s address can be geocoded to geographic coordinates so that regional patterns can be found. Besides regional patterns, you will analyse how patent topics change over time. The work therefore involves topic modeling, document embedding, and geocoding.
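To give a rough idea of what mining regional and temporal patterns could look like, the following is a minimal Python sketch. It is not part of the thesis setup: the lookup table `GEOCODER`, the toy records in `PATENTS`, the topic labels, and the helper names are all hypothetical. A real pipeline would use an actual geocoding service and topics inferred by a topic model instead of hand-assigned labels.

```python
from collections import Counter, defaultdict

# Hypothetical lookup table standing in for a real geocoding service.
GEOCODER = {
    "Potsdam, Germany": (52.39, 13.06),
    "Berlin, Germany": (52.52, 13.40),
    "Palo Alto, CA, USA": (37.44, -122.14),
}

# Made-up toy records: (inventor address, filing year, topic label).
PATENTS = [
    ("Potsdam, Germany", 2001, "databases"),
    ("Berlin, Germany", 2001, "databases"),
    ("Berlin, Germany", 2010, "machine learning"),
    ("Palo Alto, CA, USA", 2010, "machine learning"),
    ("Palo Alto, CA, USA", 2011, "machine learning"),
]


def grid_cell(lat, lon, size=1.0):
    """Map a coordinate to a coarse grid cell of `size` degrees."""
    return (int(lat // size), int(lon // size))


def topics_by_region(patents, geocoder=GEOCODER, size=1.0):
    """Count topic labels per grid cell; skip addresses that cannot be geocoded."""
    counts = defaultdict(Counter)
    for address, _year, topic in patents:
        coords = geocoder.get(address)
        if coords is None:
            continue
        counts[grid_cell(*coords, size)][topic] += 1
    return counts


def topics_by_year(patents):
    """Count topic labels per filing year."""
    counts = defaultdict(Counter)
    for _address, year, topic in patents:
        counts[year][topic] += 1
    return counts
```

Calling `topics_by_region(PATENTS)` groups the Potsdam and Berlin records into the same one-degree cell, while `topics_by_year(PATENTS)` shows how the toy topic labels shift between filing years. The grid size is an arbitrary choice for illustration.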
This master’s thesis will be jointly supervised by Julian Risch and Ioannis Koumarelas.