Prof. Dr. Felix Naumann


This dataset contains the annotated facts used to run the experiments presented in our paper "Few-Shot Knowledge Validation Using Rules" (WWW'21). It includes 26 annotated rules (22 positive, 4 negative) covering 23324 triples (instances) in total. Both the rule data and the instance data are stored in JSON format, in the files rules.json and instances.json respectively. A brief description of the most important data fields is given below.

Rules (rules.json)

  • _id - a unique ObjectID for the rule (as automatically provided by MongoDB)
  • rule_type - indicates whether the rule is positive or negative (true corresponds to positive, false to negative)
  • premise - the premise of the rule
  • conclusion - the conclusion of the rule
  • query_pattern - the query pattern used to execute the rule
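
The rule fields above can be read with a few lines of Python. The following is a minimal sketch, not part of the original tooling. It assumes that rules.json holds either a single JSON array or one JSON object per line, since the exact export layout is not specified here. The helper names load_records and split_by_type are hypothetical, and the sample records are made up for illustration.

```python
import json


def load_records(text):
    """Parse a JSON array, or fall back to one JSON object per line.

    Assumption: the file uses one of these two layouts.
    """
    text = text.strip()
    if text.startswith("["):
        return json.loads(text)
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def split_by_type(rules):
    """Split rules into (positive, negative) using the rule_type flag."""
    positive = [r for r in rules if r["rule_type"]]
    negative = [r for r in rules if not r["rule_type"]]
    return positive, negative


# Made-up sample records for illustration only, not taken from the dataset.
sample = '[{"_id": "r1", "rule_type": true}, {"_id": "r2", "rule_type": false}]'
positive, negative = split_by_type(load_records(sample))
print(len(positive), len(negative))
```

With the real file, the text could be obtained via open("rules.json", encoding="utf-8").read(). According to the description above, the split should then yield 22 positive and 4 negative rules.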

Instances / Triples (instances.json)

  • _id - a unique ObjectID for the instance (as automatically provided by MongoDB)
  • rule - the unique ObjectID of the rule that generated this instance
  • subj - the subject of the instance/triple 
  • pred - the predicate of the instance/triple
  • obj - the object of the instance/triple
  • correct - a boolean value indicating whether the fact is correct or incorrect (true corresponds to correct, false to incorrect)
  • label - an integer value indicating whether the fact is correct or incorrect (1 corresponds to correct, 0 to incorrect)
  • score - intended for future experiments; can be safely ignored
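
The instance fields can be used to group triples by their generating rule and to check that correct and label agree. The following is a minimal sketch under stated assumptions: ObjectIDs are taken to be either plain strings or MongoDB Extended JSON objects of the form {"$oid": "..."}, the helper names oid and summarize are hypothetical, and the sample records are made up for illustration.

```python
from collections import defaultdict


def oid(value):
    """Return an ObjectID as a plain string.

    Assumption: IDs are either plain strings or MongoDB Extended JSON
    objects of the form {"$oid": "..."}.
    """
    return value["$oid"] if isinstance(value, dict) else value


def summarize(instances):
    """Count total and correct instances per rule; collect label mismatches."""
    per_rule = defaultdict(lambda: {"total": 0, "correct": 0})
    mismatches = []
    for inst in instances:
        stats = per_rule[oid(inst["rule"])]
        stats["total"] += 1
        stats["correct"] += int(inst["correct"])
        # label (1/0) and correct (true/false) should encode the same judgment.
        if int(inst["correct"]) != inst["label"]:
            mismatches.append(oid(inst["_id"]))
    return dict(per_rule), mismatches


# Made-up sample records for illustration only, not taken from the dataset.
sample = [
    {"_id": "i1", "rule": {"$oid": "r1"}, "subj": "s1", "pred": "p",
     "obj": "o1", "correct": True, "label": 1},
    {"_id": "i2", "rule": {"$oid": "r1"}, "subj": "s2", "pred": "p",
     "obj": "o2", "correct": False, "label": 0},
    {"_id": "i3", "rule": "r2", "subj": "s3", "pred": "q",
     "obj": "o3", "correct": True, "label": 1},
]
per_rule, mismatches = summarize(sample)
print(per_rule, mismatches)
```

An empty mismatch list indicates that the two annotation fields are consistent, in which case either one can serve as the ground-truth label.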