RECSYS 13

Recommending Patents based on Latent Topics

Abstract

Large volumes of granted patent documents and patent applications are publicly available on the Web, which enables the use of sophisticated text mining and information retrieval methods to facilitate access to and analysis of patents. A key task when dealing with patents is finding related or similar patents, which usually requires domain experts who are also familiar with existing patents. In this paper, we investigate techniques to automatically assess the similarity of patents, which is critical for a variety of patent-related tasks. We propose the use of latent Dirichlet allocation and Dirichlet multinomial regression to represent documents and to compute similarity scores. We show how these scores can be used to provide assistance in typical patent mining scenarios such as prior art recommendation and citation prediction. We compare our methods with state-of-the-art document representations and retrieval techniques and demonstrate the effectiveness of our approach on a collection of US patent publications.

Full Paper

RECSYS13.pdf

Conference Homepage

RecSys 2013

BibTeX Entry