Hasso-Plattner-Institut
Prof. Dr. Felix Naumann
 

What is COLT?

Rules, as created by systems such as AMIE or RUDIK, are useful for detecting and curating errors in knowledge bases. However, many knowledge bases are created automatically or semi-automatically and often contain incorrect entries. When such knowledge bases are then used to automatically derive logical rules, the data quality of the underlying knowledge base also affects the quality of the generated rules. This raises the following question:

How can we be confident that a rule derived from an imperfect knowledge base is actually good?

Our COLT approach aims to answer this very question. COLT leverages deep kernel learning to estimate both the confidence and the quality of a rule in terms of its impact on the facts contained in a knowledge base. To estimate the true confidence of a rule, COLT requires only a few user interactions.
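To make the underlying notion of rule confidence concrete, here is a minimal sketch (all entities, facts, and the rule are invented for illustration): the standard confidence of a Horn rule is the fraction of the facts it predicts that are already confirmed by the knowledge base — which is exactly the quantity that becomes unreliable when the knowledge base itself contains errors.

```python
# Toy knowledge base as a set of (subject, predicate, object) triples.
# All names and facts below are hypothetical.
kb = {
    ("anna", "marriedTo", "bob"),
    ("bob", "livesIn", "berlin"),
    ("anna", "livesIn", "berlin"),
    ("carl", "marriedTo", "dana"),
    ("dana", "livesIn", "potsdam"),
    ("carl", "livesIn", "munich"),  # counterexample to the rule below
}

def rule_confidence(kb):
    """Confidence of: livesIn(x, y) <- marriedTo(x, z) AND livesIn(z, y)."""
    # Every grounding of the rule body yields a predicted head fact.
    predictions = {
        (x, "livesIn", y)
        for (x, p, z) in kb if p == "marriedTo"
        for (z2, p2, y) in kb if p2 == "livesIn" and z2 == z
    }
    # Confidence: share of predictions already present in the KB.
    confirmed = predictions & kb
    return len(confirmed) / len(predictions) if predictions else 0.0

print(rule_confidence(kb))  # 0.5: one of the two predicted facts is in the KB
```

If the KB entry contradicting a prediction is itself wrong, this measured confidence under- or overestimates the rule's true confidence — the gap COLT closes with expert annotations.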

Key Contributions

  • We propose COLT, a framework that assesses the quality and confidence of a rule by using expert-validated facts
  • We enable the conditional application of rules and compute their confidence
  • We establish a connection between our problem, the weighted-coverage problem, and quality-preserving Gaussian processes
  • We show that with only 20 user interactions, our interactive learning approach halves the error in confidence estimates obtained with rule learning systems
  • We publish our dataset consisting of 26 rules with more than 23,000 annotated facts

Talk at WWW 2021

Screenshots

Summary of all rules to be annotated
Interface for the annotation of facts

Publication

Dataset

As part of this project, we provide what was, at the time of publication, the largest dataset of manually annotated rules:

Team

Contact

For questions or feedback, please contact Michael Loster.