Hasso-Plattner-Institut
Prof. Dr. Felix Naumann
 

Dr. Hazar Harmouch

Postdoctoral Researcher

Information Systems Research Group
Hasso-Plattner-Institut | Universität Potsdam
Prof.-Dr.-Helmert-Straße 2-3, D-14482 Potsdam
 

Contact Information

 

Research

Research Interests

  • Data quality for AI (fairness, non-discrimination, privacy)
  • Data profiling, integration and cleaning
  • Data mining and machine learning
  • Similarity search

Program committee memberships

  • DBML'22, '23
  • SIGMOD 2022 (research track) [Listed as a distinguished reviewer]
  • BTW 2021, 2022 (demo-program)
  • ICDE 2019 (research track)

Journal Reviewer

  • The VLDB Journal
  • SIGMOD Record
  • TODS: ACM Transactions on Database Systems
  • JDIQ: Journal of Data and Information Quality
  • Semantic Web Journal
  • TKDE: Transactions on Knowledge and Data Engineering
  • The Computer Journal - Oxford Academic

Invited Talks

Dissertation

Single-column data profiling

 

Teaching

Lectures:

  • Database I (SoS2023) (Planned)

Seminars:

Master's Theses:

  • Semantic Type Detection for Numeric Data (Jonathan Haas, ongoing)
  • Der Einfluss von Datenqualitätsmängeln auf Fairness in KI-Systemen [The Influence of Data Quality Deficiencies on Fairness in AI Systems] (Isabel Bär, 2022)
  • Discovery of Complementation Dependencies (Jonas Hering, 2022)
  • Inferring Regular Expressions from Database Columns (Tobias Niedling, 2021)
  • Finding Related Tables on the Web (Fabian Windheuser, 2019)

 

Projects

Active Research Projects:

This interdisciplinary research project is a collaboration with VDE, Universität Tübingen, and Universität Viadrina. Its main goal is to shed light on the role of AI training and testing data from legal, ethical, informational, and practical perspectives.

Project term: December 2021 to July 2023
Funded by: Federal Ministry of Labour and Social Affairs (BMAS)

 

The Metanome project is a joint project of HPI and the Qatar Computing Research Institute (QCRI). Metanome provides a fresh view on data profiling by developing efficient algorithms and integrating them into a common tool, expanding the functionality of data profiling, and addressing performance and scalability issues for Big Data.

 

Completed Research Projects - PostDoc

Valuing real estate is a complex process that depends on both local and global properties. Working with a couple of million real-estate valuation records provided by Deutscher Sparkassenverlag (DSV), the International Real Estate Business School (IREBS) of the University of Regensburg and HPI's Information Systems and Algorithm Engineering groups collaborate to automate real-estate valuation by means of data engineering and artificial intelligence.

Project term: July 2020 to September 2021
Funded by: Deutscher Sparkassenverlag (DSV)

 

Completed Research Projects - PhD

 

Publications

2021

  • Harmouch, H., Papenbrock, T., Naumann, F.: Relational Header Discovery using Similarity Search in a Table Corpus. IEEE International Conference on Data Engineering (ICDE). pp. 444–455 (2021).
     

2019

  • Dürsch, F., Stebner, A., Windheuser, F., Fischer, M., Friedrich, T., Strelow, N., Bleifuß, T., Harmouch, H., Jiang, L., Papenbrock, T., Naumann, F.: Inclusion Dependency Discovery: An Experimental Evaluation of Thirteen Algorithms. Proceedings of the International Conference on Information and Knowledge Management (CIKM). pp. 219–228 (2019).
     

2018

  • Berti-Equille, L., Harmouch, H., Naumann, F., Novelli, N., Thirumuruganathan, S.: Discovery of Genuine Functional Dependencies from Relational Data with Missing Values. Proceedings of the VLDB Endowment (PVLDB). pp. 880–892 (2018).
     

2017

  • Harmouch, H., Naumann, F.: Cardinality Estimation: An Experimental Survey. Proceedings of the VLDB Endowment (PVLDB). pp. 499–512 (2017).
     

2016

  • Kruse, S., Papenbrock, T., Harmouch, H., Naumann, F.: Data Anamnesis: Admitting Raw Data into an Organization. IEEE Data Engineering Bulletin. 39, 8–20 (2016).