Hasso-Plattner-Institut
Prof. Dr. Felix Naumann
  
 

Publications (sorted in inverse chronological order)

DFD: Efficient Discovery of Functional Dependencies

Abedjan, Ziawasch; Schulze, Patrick; Naumann, Felix. In Proceedings of the International Conference on Information and Knowledge Management (CIKM), Shanghai, China, pages 949-958, 2014.

The discovery of unknown functional dependencies in a dataset is of great importance for database redesign, anomaly detection, and data cleansing applications. However, because the problem is exponential in the number of attributes, none of the existing approaches can be applied to large datasets. We present a new algorithm, DFD, for discovering all functional dependencies in a dataset. It follows a depth-first traversal strategy of the attribute lattice that combines aggressive pruning with efficient result verification. Our approach scales far beyond existing algorithms, to datasets of up to 7.5 million tuples, and is up to three orders of magnitude faster than existing approaches on smaller datasets.
Winner of the CIKM 2014 Best Student Paper Award.
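To make the problem in the abstract concrete, the following is a minimal Python sketch of what it means for a functional dependency to hold and why naive discovery is exponential in the number of attributes. It is not the DFD algorithm: it enumerates candidate left-hand sides level by level and by brute force, without DFD's depth-first traversal or pruning. The function names `fd_holds` and `minimal_fds` and the sample rows are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def fd_holds(rows, lhs, rhs):
    """Return True if the attributes in lhs functionally determine rhs.

    The dependency holds when no two rows agree on all lhs attributes
    but differ on the rhs attribute.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        if seen.setdefault(key, row[rhs]) != row[rhs]:
            return False
    return True


def minimal_fds(rows, attributes):
    """Naively enumerate all minimal, non-trivial dependencies lhs -> rhs.

    Every subset of the remaining attributes is a candidate left-hand
    side, so the search space grows exponentially with the attribute count.
    """
    result = []
    for rhs in attributes:
        others = [a for a in attributes if a != rhs]
        found = []  # minimal left-hand sides found so far for this rhs
        for size in range(len(others) + 1):
            for lhs in combinations(others, size):
                # A superset of a known left-hand side is not minimal.
                if any(set(m) <= set(lhs) for m in found):
                    continue
                if fd_holds(rows, lhs, rhs):
                    found.append(lhs)
        result.extend((lhs, rhs) for lhs in found)
    return result


# Hypothetical sample data, not taken from the paper.
rows = [
    {"zip": "14482", "city": "Potsdam", "name": "Ann"},
    {"zip": "14482", "city": "Potsdam", "name": "Bob"},
    {"zip": "10115", "city": "Berlin", "name": "Ann"},
]
attributes = ["zip", "city", "name"]

for lhs, rhs in minimal_fds(rows, attributes):
    print(", ".join(lhs), "->", rhs)
```

On this sample, the only minimal dependencies are between `zip` and `city`, in both directions. A naive search of this kind checks on the order of 2^(n-1) candidate left-hand sides per attribute for n attributes, which is the cost the abstract says DFD reduces through pruning.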
DFD_CIKM2014_p949_CRC.pdf
Tags: hpi, isg, profiling