Hasso-Plattner-Institut
Prof. Dr. Felix Naumann
  
 

Hazar Harmouch

Research Assistant , PhD Candidate

Infomation Systems Research Group 
Hasso-Plattner-Institut | Universität Potsdam
Prof.-Dr.-Helmert-Straße 2-3, D-14482 Potsdam

Contact Information

Research Interests

  • Web Tables similarity and Table joins
  • Data Profiling
  • Data Mining
  • Big Data

Projects

Teaching

Publications

Cardinality Estimation: An Experimental Survey

Harmouch, Hazar; Naumann, Felix in Proceedings of the VLDB Endowment (PVLDB) volume   11   of   11 , page 499 - 512 . 2017 .

Data preparation and data profiling comprise many both basic and complex tasks to analyze a dataset at hand and extract metadata, such as data distributions, key candidates, and functional dependencies. Among the most important types of metadata is the number of distinct values in a column, also known as the zeroth-frequency moment. Cardinality estimation itself has been an active research topic in the past decades due to its many applications. The aim of this paper is to review the literature of cardinality estimation and to present a detailed experimental study of twelve algorithms, scaling far beyond the original experiments. First, we outline and classify approaches to solve the problem of cardinality estimation- we describe their main idea, error guarantees, advantages, and disadvantages. Our experimental survey then compares the performance all twelve cardinality estimation algorithms. We evaluate the algorithms' accuracy, runtime, and memory consumption using synthetic and real-world datasets. Our results show that different algorithms excel in different in categories, and we highlight their trade-offs
[ URL ] [ DOI ]
Cardinality Estimation: A... - Download
Further Information
Tags cardinality  distinct_value  hpi  isg  profiling