Abstract for CurEx Paper
The integration of diverse structured and unstructured information sources into a unified, domain-specific knowledge base is an important task in many areas.
A well-maintained knowledge base enables data analysis in complex scenarios, for example risk analysis in the financial sector or the investigation of large data leaks such as the Paradise Papers or the Panama Papers.
Both the creation of such knowledge bases and their continuous maintenance and curation involve many complex tasks and considerable manual effort.
Because the integration process can introduce errors, users must be able to manually correct erroneous information in the knowledge base.
With CurEx, we present a modular system that allows structured and unstructured data sources to be integrated into a domain-specific knowledge base.
In particular, we (i) enable the incremental improvement of each individual integration component;
(ii) enable the selective generation of multiple knowledge graphs from the information contained in the knowledge base;
and (iii) provide two distinct user interfaces tailored to the needs of data engineers and end users, respectively.
The former offers curation capabilities and control over the integration process, whereas the latter focuses on exploring the generated knowledge graphs.