This dataset contains the annotated facts used to run the experiments presented in our paper "Few-Shot Knowledge Validation Using Rules" (WWW'21). It includes 26 annotated rules (22 positive, 4 negative) covering 23324 triples (instances) in total. The rule and instance data are stored in JSON format in the rules.json and instances.json files, respectively. A brief description of the most important data fields is given below.
Rules (rules.json)
- _id - a unique ObjectID for the rule (automatically assigned by MongoDB)
- rule_type - indicates whether the rule is positive or negative (true corresponds to positive, false to negative)
- premise - the premise of the rule
- conclusion - the conclusion of the rule
- query_pattern - the query pattern used to execute the rule
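As a rough illustration of how the rule fields might be read, the sketch below parses rule records and counts positive and negative rules via rule_type. It rests on assumptions not stated in this description: that the file is either a single JSON array or one JSON object per line (both are common MongoDB export layouts), and that rule_type is a JSON boolean. The helper names and the sample records are hypothetical placeholders, not data from rules.json.

```python
import json


def load_records(text):
    """Parse a JSON array, or newline-delimited JSON (one object per line).

    Assumption: the export uses one of these two layouts.
    """
    text = text.strip()
    if text.startswith("["):
        return json.loads(text)
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def count_rule_types(rules):
    """Return (number of positive rules, number of negative rules)."""
    positive = sum(1 for rule in rules if rule["rule_type"] is True)
    negative = sum(1 for rule in rules if rule["rule_type"] is False)
    return positive, negative


# Hypothetical sample records; all field values are placeholders.
SAMPLE_RULES = """[
  {"_id": "r1", "rule_type": true, "premise": "premise-1",
   "conclusion": "conclusion-1", "query_pattern": "pattern-1"},
  {"_id": "r2", "rule_type": false, "premise": "premise-2",
   "conclusion": "conclusion-2", "query_pattern": "pattern-2"}
]"""

rules = load_records(SAMPLE_RULES)
print(count_rule_types(rules))
```

To try this on the actual file, the sample string could be replaced by the contents of rules.json, for example open("rules.json", encoding="utf-8").read(); if the layout assumption holds, the counts should match the 22 positive and 4 negative rules stated above.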
Instances / Triples (instances.json)
- _id - a unique ObjectID for the instance (automatically assigned by MongoDB)
- rule - the ObjectID of the rule that generated this instance
- subj - the subject of the instance/triple
- pred - the predicate of the instance/triple
- obj - the object of the instance/triple
- correct - a boolean value indicating whether the fact is correct or incorrect (true corresponds to correct, false to incorrect)
- label - an integer value indicating whether the fact is correct or incorrect (1 corresponds to correct, 0 to incorrect)
- score - reserved for future experiments; it can be safely ignored
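The instance fields can be combined in a similar way. The sketch below groups instances by their rule and counts how many are marked correct, and also checks that label agrees with correct. It assumes that ObjectIDs appear either as plain strings or in MongoDB Extended JSON form ({"$oid": "..."}); which form the files use is not stated here. The helper names and the sample records are hypothetical.

```python
from collections import defaultdict


def oid(value):
    """Return an ObjectID as a plain string.

    Assumption: IDs are either strings or Extended JSON ({"$oid": "..."}).
    """
    if isinstance(value, dict) and "$oid" in value:
        return value["$oid"]
    return str(value)


def summarize_by_rule(instances):
    """Map each rule ID to (number of instances, number marked correct)."""
    totals = defaultdict(lambda: [0, 0])
    for inst in instances:
        entry = totals[oid(inst["rule"])]
        entry[0] += 1
        entry[1] += int(bool(inst["correct"]))
    return {rule_id: tuple(counts) for rule_id, counts in totals.items()}


def inconsistent_labels(instances):
    """Return instances whose integer `label` disagrees with `correct`."""
    return [i for i in instances if int(bool(i["correct"])) != int(i["label"])]


# Hypothetical sample records; all field values are placeholders.
SAMPLE_INSTANCES = [
    {"rule": {"$oid": "r1"}, "subj": "s1", "pred": "p", "obj": "o1",
     "correct": True, "label": 1},
    {"rule": {"$oid": "r1"}, "subj": "s2", "pred": "p", "obj": "o2",
     "correct": False, "label": 0},
    {"rule": "r2", "subj": "s3", "pred": "q", "obj": "o3",
     "correct": True, "label": 1},
]

print(summarize_by_rule(SAMPLE_INSTANCES))
print(inconsistent_labels(SAMPLE_INSTANCES))
```

Since correct and label are described as encoding the same judgment, inconsistent_labels would be expected to return an empty list on the real data; a non-empty result would suggest the layout assumptions above do not hold.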