Prof. Dr. Felix Naumann

Dr. Ziawasch Abedjan


former member






Please visit my new page at MIT CSAIL for the most recent news.

Research Activities


  • Data Profiling
  • Data Mining


Master's Thesis Co-Supervision

  • Benjamin Emde: "Context-Aware Recommendations in Social Networks", 2012
  • Sven Viehmeier: "Incremental Data Profiling", 2012/2013
  • Patrick Schulze: "Depth-First Discovery of Functional Dependencies", 2013/2014

Master Project Co-Supervision

  • "Global Relevance Scores for DBpedia Facts", 2012/2013



A Hybrid Approach for Efficient Unique Column Combination Discovery

Papenbrock, Thorsten; Naumann, Felix in Proceedings of the Conference on Database Systems for Business, Technology, and Web (BTW), 2017.

Unique column combinations (UCCs) are groups of attributes in relational datasets that contain no value-entry more than once. Hence, they indicate keys and serve data management tasks, such as schema normalization, data integration, and data cleansing. Because the unique column combinations of a particular dataset are usually unknown, UCC discovery algorithms have been proposed to find them. All previous such discovery algorithms are, however, inapplicable to datasets of typical real-world size, e.g., datasets with more than 50 attributes and a million records. We present the hybrid discovery algorithm HyUCC, which uses the same discovery techniques as the recently proposed functional dependency discovery algorithm HyFD: a hybrid combination of fast approximation techniques and efficient validation techniques. With it, the algorithm discovers all minimal unique column combinations in a given dataset. HyUCC not only outperforms all existing approaches, it also scales to much larger datasets.
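
To make the definition above concrete, here is a minimal sketch of what "discovering all minimal UCCs" means. It is a naive brute-force enumeration and not the HyUCC algorithm: it tests every candidate column combination, which is exponential in the number of attributes and illustrates why such an approach does not scale to wide datasets. The function names and the sample rows are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def is_unique(rows, columns):
    """Return True if no combination of values in `columns` occurs twice."""
    seen = set()
    for row in rows:
        key = tuple(row[c] for c in columns)
        if key in seen:
            return False
        seen.add(key)
    return True


def minimal_uccs(rows, attributes):
    """Enumerate all minimal UCCs by brute force, smallest candidates first."""
    found = []
    for size in range(1, len(attributes) + 1):
        for candidate in combinations(attributes, size):
            # A superset of an already found UCC is unique but not minimal.
            if any(set(ucc) <= set(candidate) for ucc in found):
                continue
            if is_unique(rows, candidate):
                found.append(candidate)
    return found


# Hypothetical sample relation (assumption, not taken from the paper).
rows = [
    {"first": "Ada", "last": "Lovelace", "city": "London"},
    {"first": "Ada", "last": "Byron", "city": "London"},
    {"first": "Alan", "last": "Turing", "city": "London"},
    {"first": "Alan", "last": "Lovelace", "city": "Paris"},
]

result = minimal_uccs(rows, ["first", "last", "city"])
print(result)
```

In this sample no single column is unique, so the minimal UCCs are pairs of columns; the three-column combination is skipped because it contains a smaller UCC.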

Review Activity

  • ACM Transactions on the Web
  • DESWeb 2014