Hasso-Plattner-Institut
Prof. Dr. Felix Naumann
  
 

Tobias Bleifuß

Ph.D. student at the Information Systems Research Group

Contact Information

Prof.-Dr.-Helmert-Straße 2-3
D-14482 Potsdam
Room: G-3.1.09 Directions

Phone: +49 331 5509 1303

Email: Tobias Bleifuß

Research Interests

  • Data Profiling
  • Data Change Exploration
  • Data Quality
  • Query Processing

Publications

Approximate Discovery of Functional Dependencies for Large Datasets

Bleifuß, Tobias; Bülow, Susanne; Frohnhofen, Johannes; Risch, Julian; Wiese, Georg; Kruse, Sebastian; Papenbrock, Thorsten; Naumann, Felix in Proceedings of the International Conference on Information and Knowledge Management (CIKM) page 1803-1812 . New York, NY, USA , ACM , 2016 .

Functional dependencies (FDs) are an important prerequisite for various data management tasks, such as schema normalization, query optimization, and data cleansing. However, automatic FD discovery entails an exponentially growing search and solution space, so that even today’s fastest FD discovery algorithms are limited to small datasets only, due to long runtimes and high memory consumptions. To overcome this situation, we propose an approximate discovery strategy that sacrifices possibly little result correctness in return for large performance improvements. In particular, we introduce AID-FD, an algorithm that approximately discovers FDs within runtimes up to orders of magnitude faster than state-of-the-art FD discovery algorithms. We evaluate and compare our performance results with a focus on scalability in runtime and memory, and with measures for completeness, correctness, and minimality.
[ URL ]
fd_paper.pdf
Further Information
Tags approximate discovery functional_dependencies hpi isg profiling