Improving RDF Data through Association Rule Mining
Ziawasch Abedjan and Felix Naumann
Abstract. Linked Open Data comprises many, often large, public data sets, most of which are represented in the RDF triple structure of subject, predicate, and object. However, the heterogeneity of available open data requires significant integration steps before it can be used in applications. A promising and novel technique for exploring such data is association rule mining. We introduce “mining configurations”, which allow us to mine RDF data sets in various ways. Different configurations enable us to identify schema and value dependencies that, in combination, give rise to interesting use cases. We present rule-based approaches for predicate suggestion, data enrichment, ontology improvement, and query relaxation. On the one hand, we prevent inconsistencies in the data through predicate suggestion, enrichment with missing facts, and alignment of the corresponding ontology. On the other hand, we help users handle inconsistencies during query formulation through predicate expansion techniques. Based on these approaches, we show that association rule mining benefits the integration and usability of RDF data.
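
To make the idea of a mining configuration concrete, the following minimal sketch assumes one plausible reading: one part of each triple (here the subject) serves as the transaction context, and another part (here the predicate) supplies the items to be mined. The toy triples, the function names (build_transactions, mine_pair_rules, suggest_predicates), and the thresholds are hypothetical and chosen for illustration only. The sketch is restricted to rules between single items and is not the authors' implementation.

```python
from collections import defaultdict
from itertools import permutations

# Toy RDF triples (subject, predicate, object); all values are invented.
TRIPLES = [
    ("Alice", "birthPlace", "Berlin"),
    ("Alice", "birthDate", "1980"),
    ("Alice", "occupation", "Engineer"),
    ("Bob", "birthPlace", "Paris"),
    ("Bob", "birthDate", "1975"),
    ("Carol", "birthPlace", "Rome"),
    ("Carol", "birthDate", "1990"),
    ("Carol", "occupation", "Doctor"),
    ("Dave", "birthPlace", "Oslo"),
]


def build_transactions(triples, context=0, target=1):
    """Group the `target` part of each triple by its `context` part.

    Assumed encoding: index 0 = subject, 1 = predicate, 2 = object.
    The default (context=subject, target=predicate) yields one
    transaction of predicates per subject.
    """
    transactions = defaultdict(set)
    for triple in triples:
        transactions[triple[context]].add(triple[target])
    return dict(transactions)


def mine_pair_rules(transactions, min_support=0.5, min_confidence=0.8):
    """Return sorted rules (lhs, rhs, support, confidence) between single items."""
    n = len(transactions)
    item_count = defaultdict(int)
    pair_count = defaultdict(int)
    for items in transactions.values():
        for item in items:
            item_count[item] += 1
        # Count every ordered pair of distinct items that co-occur.
        for lhs, rhs in permutations(sorted(items), 2):
            pair_count[(lhs, rhs)] += 1
    rules = []
    for (lhs, rhs), count in pair_count.items():
        support = count / n
        confidence = count / item_count[lhs]
        if support >= min_support and confidence >= min_confidence:
            rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


def suggest_predicates(existing, rules):
    """Suggest predicates implied by rules whose left-hand side is already present."""
    best = {}
    for lhs, rhs, _support, confidence in rules:
        if lhs in existing and rhs not in existing:
            best[rhs] = max(best.get(rhs, 0.0), confidence)
    # Highest confidence first; ties are broken alphabetically.
    return sorted(best, key=lambda p: (-best[p], p))


if __name__ == "__main__":
    transactions = build_transactions(TRIPLES)
    rules = mine_pair_rules(transactions)
    for lhs, rhs, support, confidence in rules:
        print(f"{lhs} -> {rhs} (support={support:.2f}, confidence={confidence:.2f})")
    print(suggest_predicates({"occupation"}, rules))
```

Passing other values for context and target (for example context=2 to group by object) would give a different configuration under the same assumed encoding. Which configurations are useful for which use case is left open by this sketch.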