Ralf Krestel


PST 21

PatentMatch: A Dataset for Matching Patent Claims & Prior Art

Abstract

Patent examiners need to solve a complex information retrieval task when they assess the novelty and inventive step of claims made in a patent application. Given a claim, they search for prior art, which comprises all relevant publicly available information. This time-consuming task requires a deep understanding of the respective technical domain and of patent-specific language. For these reasons, we address the computer-assisted search for prior art by creating a training dataset for supervised machine learning called PatentMatch. It contains pairs of claims from patent applications and text passages from cited patent documents that correspond to those claims semantically to varying degrees. Each pair has been labeled by technically skilled patent examiners from the European Patent Office. The label indicates the degree of semantic correspondence (matching), i.e., whether or not the text passage is prejudicial to the novelty of the claimed invention. Preliminary experiments using a baseline system show that PatentMatch can indeed be used for training a binary text pair classifier and a dense passage retriever on this challenging information retrieval task. The dataset is available online: https://hpi.de/naumann/s/patentmatch.
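To make the pair-classification setup described in the abstract concrete, here is a minimal sketch in Python. It is not the baseline system from the paper: the field names (`claim_text`, `passage_text`, `label`), the token-overlap heuristic, the 0.5 threshold, and the two toy pairs are all assumptions made for illustration, and the released files may use a different schema.

```python
from dataclasses import dataclass
import re


@dataclass(frozen=True)
class ClaimPassagePair:
    # Hypothetical field names; the released dataset may use a different schema.
    claim_text: str
    passage_text: str
    label: int  # 1 = passage is prejudicial to the claim's novelty, 0 = it is not


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Token-set Jaccard similarity in [0, 1]."""
    ta, tb = tokenize(a), tokenize(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def predict(pair: ClaimPassagePair, threshold: float = 0.5) -> int:
    """Predict 1 when lexical overlap reaches the (arbitrarily chosen) threshold."""
    return int(jaccard(pair.claim_text, pair.passage_text) >= threshold)


# Made-up toy pairs for illustration; these are not taken from PatentMatch.
pairs = [
    ClaimPassagePair("a valve with a spring-loaded seal",
                     "the valve has a spring-loaded seal", 1),
    ClaimPassagePair("a valve with a spring-loaded seal",
                     "battery cells are stacked in series", 0),
]
accuracy = sum(predict(p) == p.label for p in pairs) / len(pairs)
```

A purely lexical heuristic like this one is only a sanity check. Because the task depends on semantic correspondence rather than shared vocabulary, a learned text pair classifier, as the abstract describes, is the more appropriate model.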

Full Paper

PST21.pdf

Workshop Homepage

PST 2021

BibTex Entry

@inProceedings{krestel-pst21,
  author    = {Risch, Julian and Alder, Nicolas and Hewel, Christoph and Krestel, Ralf},
  booktitle = {Proceedings of the 2nd Workshop on Patent Text Mining and Semantic Technologies (PatentSemTech), Workshop at SIGIR},
  title     = {PatentMatch: A Dataset for Matching Patent Claims \& Prior Art},
  location  = {online},
  OPTmonth  = {July 15th},
  year      = {2021}
}

