Hasso-Plattner-Institut
Prof. Dr. Felix Naumann
 

Estimating Data Integration and Cleaning Effort

The paper "Estimating Data Integration and Cleaning Effort" (see Publications) investigates how the effort for data integration and data cleaning in a concrete integration scenario can be estimated upfront, i.e., before performing the actual integration. This page provides the files needed to reproduce the experiments from the paper.

Available files

Efes: source code and executable binary of the effort estimation prototype
Integration scenarios: the datasets and schema mappings used for the experiments
Effort measurements: summary of the effort actually spent on performing the integration scenarios

Links

FreeDB: used to create sources and targets in the discographic integration case studies
MusicBrainz: used to create sources and targets in the discographic integration case studies
discogs: used to create sources and targets in the discographic integration case studies
amalgam: used to create sources and targets in the bibliographic integration case studies