Schirmer, Philipp; Papenbrock, Thorsten; Kruse, Sebastian; Naumann, Felix; Hempfing, Dennis; Mayer, Torben; Neuschäfer-Rube, Daniel
Proceedings of the International Conference on Extending Database Technology (EDBT)
Functional dependencies (FDs) support various tasks for the management of relational data, such as schema normalization, data cleaning, and query optimization. However, while existing FD discovery algorithms regard only static datasets, many real-world datasets are constantly changing – and with them their FDs. Unfortunately, the computational hardness of FD discovery prohibits a continuous re-execution of those existing algorithms with every change of the data. To this end, we propose DynFD, the first algorithm to dis cover and maintain functional dependencies in dynamic datasets. Whenever the inspected dataset changes, DynFD evolves its FDs rather than recalculating them. For this to work efficiently, we propose indexed data structures along with novel and efficient update operations. Our experiments compare DynFD’s incremental mode of operation to the repeated re-execution of existing, static algorithms. They show that DynFD can maintain the FDs of dynamic datasets over an order of magnitude faster than its static counter-parts.