The efficient discovery of functional dependencies (FDs) in tables is a well-known challenge in database research, and several approaches have been proposed for it.
FDHits is a new FD discovery algorithm that models the task as a hitting-set enumeration problem.
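To illustrate the general connection between FDs and hitting sets, here is a minimal brute-force sketch. It is not the FDHits algorithm itself, only the textbook reduction it builds on: for a right-hand-side attribute A, a set of attributes X determines A exactly when X intersects every difference set (the attributes on which two rows disagree) that contains A, after A itself has been removed. The function names and the tiny example table are assumptions made for illustration.

```python
from itertools import combinations


def difference_sets(rows, num_attrs):
    """Collect, for every pair of rows, the set of attribute indices on which they differ."""
    diffs = set()
    for r, s in combinations(rows, 2):
        d = frozenset(i for i in range(num_attrs) if r[i] != s[i])
        if d:
            diffs.add(d)
    return diffs


def minimal_hitting_sets(universe, sets):
    """Enumerate minimal hitting sets by brute force, smallest candidates first.

    Exponential in the number of attributes; only meant for tiny examples.
    """
    found = []
    for k in range(len(universe) + 1):
        for cand in combinations(sorted(universe), k):
            c = frozenset(cand)
            # Skip supersets of an already found hitting set (not minimal).
            if any(h <= c for h in found):
                continue
            # A hitting set must intersect every set; an empty set can never be hit.
            if all(c & s for s in sets):
                found.append(c)
    return found


def discover_fds(rows, num_attrs):
    """Return minimal FDs as (lhs, rhs) pairs of attribute indices."""
    diffs = difference_sets(rows, num_attrs)
    fds = []
    for rhs in range(num_attrs):
        # Only row pairs that disagree on rhs constrain the left-hand side.
        to_hit = [d - {rhs} for d in diffs if rhs in d]
        universe = set(range(num_attrs)) - {rhs}
        for lhs in minimal_hitting_sets(universe, to_hit):
            fds.append((tuple(sorted(lhs)), rhs))
    return fds


# Hypothetical example table: columns 1 and 2 determine each other, column 0 is independent.
rows = [(1, "a", "x"), (1, "b", "y"), (2, "a", "x"), (2, "b", "y")]
print(discover_fds(rows, 3))
```

The sketch compares all row pairs and tries all attribute subsets, so it only scales to toy inputs. Practical algorithms avoid exactly this exhaustive enumeration.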
A publication on this algorithm is currently under review at PVLDB Vol. 14.
You can find the source code of FDHits here and all datasets that we used for the evaluation here.
An experimental evaluation of seven algorithms
The following seven FD algorithms were compared in an experimental evaluation published in PVLDB Vol. 8.
|Lattice traversal algorithms:|
|Difference- and agree-set algorithms:|
|Dependency induction algorithms:|
All seven FD algorithms can be executed with the data profiling tool Metanome. A build of Metanome version 0.0.2 can be downloaded here.