Cardinality Estimation
Repeatability: Cardinality Estimation: An Experimental Survey
This is a repeatability page for the paper:
Harmouch, H., Naumann, F.: Cardinality Estimation: An Experimental Survey. Proceedings of the VLDB Endowment (PVLDB). pp. 499 - 512 (2017).
The algorithms are provided in the state their results have been published, but they may not represent the most recent version of their implementations.
Metanome: How To`s
- What is Metanome?
- How to use Metanome GUI?
- How to compiling Metanome and the algorithms from source?
- GitHub repositories (Metanome and Profiling Algorithms).
- How to repeate the experiments in the paper?
Materials
- Results.
- VLDB 2018 talk.
- The tool we used to plot the figures (MagicPlot Student).
Algorithms
The following twelve algorithms are the most popular and well-know cardinality estimation algorithms:
Flajolet and Martin (FM) | 1985 | jar | code |
Probabilistic counting with stochastic averaging(PCSA) | 1985 | jar | code |
Linear Counting (LC) | 1990 | jar | code |
Alon, Martias and Szegedy (AMS) | 1996 | jar | code |
Baryossef, Jayram, Kumar, Sivakumar and Trevisan(BJKST) | 2002 | jar | code |
| 2003 | jar | code | |
| 2003 | jar | code | |
| 2005 | jar | code | |
| 2007 | jar | code | |
| 2008 | jar | code | |
| 2010 | jar | code | |
| 2013 | jar | code | |
Baseline used a hash table | jar | ||
| jar |
Datasets
All algorithms have been exhaustively tested on the following datasets:
| Dataset | #Attributes | #Tuples |
|---|---|---|
| 25 (of 71) | 7,560,886 | |
| 11 | 93,849,474 |
As well as 90 synthetic datasets were generated by the Mersenne Twister random number generator.
Contact
For any question, please contact Hazar Harmouch.