Automating data integration and cleaning has been a fundamental research goal for several decades, yet it remains difficult because the requirements depend heavily on the application scenario. Recent learning-based techniques rely on extensive parameter tuning by experts or on large amounts of labeled data, which impedes their deployment in ad-hoc integration workflows.
In my talk, I discuss how we overcome these problems by building systems that follow example-based and declarative paradigms.