Hasso-Plattner-Institut
Prof. Dr. Felix Naumann

Dr. Ziawasch Abedjan

former member

Please visit my new page at MIT CSAIL for the most recent news.

Research Activities

Topics

  • Data Profiling
  • Data Mining

Projects

Master's Thesis Co-Supervision

  • Benjamin Emde: "Context-Aware Recommendations in Social Networks", 2012
  • Sven Viehmeier: "Incremental Data Profiling", 2012/2013
  • Patrick Schulze: "Depth-First Discovery of Functional Dependencies", 2013/2014

Master Project Co-Supervision

  • "Global Relevance Scores for DBpedia Facts", 2012/2013

Teaching

Publications

DFD: Efficient Discovery of Functional Dependencies

Ziawasch Abedjan, Patrick Schulze, Felix Naumann
In Proceedings of the International Conference on Information and Knowledge Management (CIKM), Shanghai, China, pages 949-958, 2014

Abstract:

The discovery of unknown functional dependencies in a dataset is of great importance for database redesign, anomaly detection and data cleansing applications. However, as the nature of the problem is exponential in the number of attributes none of the existing approaches can be applied on large datasets. We present a new algorithm DFD for discovering all functional dependencies in a dataset following a depth-first traversal strategy of the attribute lattice that combines aggressive pruning and efficient result verification. Our approach is able to scale far beyond existing algorithms for up to 7.5 million tuples, and is up to three orders of magnitude faster than existing approaches on smaller datasets.

Winner of the CIKM 2014 Best Student Paper Award
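
To make the notion of a functional dependency from the abstract above concrete, here is a minimal Python sketch. It is not the DFD algorithm: it is a naive level-wise search over attribute combinations, with only a simple superset check standing in for pruning. The function names (`holds`, `minimal_fds`) and the toy rows are assumptions made for illustration, and dependencies with an empty left-hand side (constant columns) are ignored.

```python
from itertools import combinations


def holds(rows, lhs, rhs):
    """Return True if the attributes in lhs functionally determine rhs in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = row[rhs]
        # The first value seen for a key is stored; any later mismatch violates the FD.
        if seen.setdefault(key, value) != value:
            return False
    return True


def minimal_fds(rows, attributes):
    """Naive search for minimal FDs with a non-empty left-hand side (not DFD)."""
    result = []
    for rhs in attributes:
        others = [a for a in attributes if a != rhs]
        found = []  # minimal left-hand sides found so far for this rhs
        for size in range(1, len(others) + 1):
            for lhs in combinations(others, size):
                # Skip supersets of a known left-hand side: they cannot be minimal.
                if any(set(m) <= set(lhs) for m in found):
                    continue
                if holds(rows, lhs, rhs):
                    found.append(lhs)
        result.extend((lhs, rhs) for lhs in found)
    return result


# Hypothetical toy relation, invented for this example.
rows = [
    {"zip": "14482", "city": "Potsdam", "name": "A"},
    {"zip": "14482", "city": "Potsdam", "name": "B"},
    {"zip": "10115", "city": "Berlin", "name": "C"},
]
fds = minimal_fds(rows, ["zip", "city", "name"])
for lhs, rhs in fds:
    print(", ".join(lhs), "->", rhs)
```

The number of candidate left-hand sides grows exponentially with the number of attributes, which is the scaling problem the abstract refers to.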

Keywords:

profiling,hpi

BibTeX file

@inproceedings{dfd2014,
author = { Ziawasch Abedjan and Patrick Schulze and Felix Naumann },
title = { DFD: Efficient Discovery of Functional Dependencies },
year = { 2014 },
pages = { 949--958 },
abstract = { The discovery of unknown functional dependencies in a dataset is of great importance for database redesign, anomaly detection and data cleansing applications. However, as the nature of the problem is exponential in the number of attributes none of the existing approaches can be applied on large datasets. We present a new algorithm DFD for discovering all functional dependencies in a dataset following a depth-first traversal strategy of the attribute lattice that combines aggressive pruning and efficient result verification. Our approach is able to scale far beyond existing algorithms for up to 7.5 million tuples, and is up to three orders of magnitude faster than existing approaches on smaller datasets. },
note = { Winner of the CIKM 2014 Best Student Paper Award },
keywords = { profiling,hpi },
booktitle = { Proceedings of the International Conference on Information and Knowledge Management (CIKM), Shanghai, China },
priority = { 0 }
}

last change: Tue, 14 Apr 2015 17:53:50 +0200

Review Activity

  • ACM Transactions on the Web
  • DESWeb 2014