Estimating Data Integration and Cleaning Effort
The paper "Estimating Data Integration and Cleaning Effort" (see Publications) investigates how the effort required for data integration and data cleaning in a concrete integration scenario can be estimated upfront, i.e., before performing the actual integration. This page provides the files needed to reproduce the experiments from the paper.
Authors
- Sebastian Kruse
- Paolo Papotti
- Felix Naumann
Available files
| Efes | source code and executable binary of the effort estimation prototype |
| Integration scenarios | the datasets and schema mappings used for the experiments |
| Effort measurements | summary of the effort spent on actually performing the integration scenarios |
Links
| FreeDB | used to create sources and targets in the discographic integration case studies |
| MusicBrainz | used to create sources and targets in the discographic integration case studies |
| Discogs | used to create sources and targets in the discographic integration case studies |
| Amalgam | used to create sources and targets in the bibliographic integration case studies |