Hasso-Plattner-Institut

Explainable Data Matching (Summer Semester 2022)

Lecturer: Prof. Dr. Felix Naumann (Information Systems)
Course website: https://hpi.de/naumann/teaching/current-courses/ss-22/explainable-data-matching.html

General Information

  • Weekly hours per semester: 4
  • ECTS: 6
  • Graded: Yes
  • Enrollment period: 01.04.2022 - 30.04.2022
  • Course format: Seminar
  • Enrollment type: Compulsory elective module
  • Language of instruction: English
  • Maximum number of participants: 6

Degree Programs, Module Groups & Modules

IT-Systems Engineering MA
  • OSIS: Operating Systems & Information Systems Technology
    • HPI-OSIS-K Concepts and Methods
  • OSIS: Operating Systems & Information Systems Technology
    • HPI-OSIS-S Specialization
  • OSIS: Operating Systems & Information Systems Technology
    • HPI-OSIS-T Techniques and Tools
Data Engineering MA


Description

Data matching is the process of detecting (and subsequently cleaning) multiple representations of the same real-world object within a given dataset. Typical approaches create a candidate set of record pairs, compute a similarity score for each pair, and compare that score to a threshold to decide whether the pair is a match. Such data matching systems and their components can be quite complex, and their results are difficult to understand. Building upon the data matching benchmark platform Frost and its implementation Snowman (pdf, github), we plan to develop methods that better explain data matching results to developers and domain experts.
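
The typical pipeline described above (candidate generation, similarity computation, threshold comparison) can be illustrated with a minimal Python sketch. It is not taken from Frost or Snowman; the toy records, the blocking key (`city`), the similarity measure (`difflib.SequenceMatcher` on names), and the threshold of 0.85 are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Toy records; field names and values are invented for illustration.
records = [
    {"id": 1, "name": "Jon Smith", "city": "Berlin"},
    {"id": 2, "name": "John Smith", "city": "Berlin"},
    {"id": 3, "name": "Maria Garcia", "city": "Potsdam"},
    {"id": 4, "name": "Maria Garcia", "city": "Potsdam"},
    {"id": 5, "name": "Peter Jones", "city": "Berlin"},
]


def candidate_pairs(records):
    """Candidate generation (blocking): pair only records sharing a city."""
    blocks = {}
    for record in records:
        blocks.setdefault(record["city"], []).append(record)
    for block in blocks.values():
        yield from combinations(block, 2)


def similarity(a, b):
    """Similarity of the two names, a value between 0 and 1."""
    return SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()


def match(records, threshold=0.85):
    """Return (id, id, score) for candidate pairs at or above the threshold."""
    results = []
    for a, b in candidate_pairs(records):
        score = similarity(a, b)
        if score >= threshold:
            results.append((a["id"], b["id"], round(score, 2)))
    return results


matches = match(records)
for left, right, score in matches:
    print(f"records {left} and {right} match with score {score}")
```

On this toy data the sketch reports the pairs (1, 2) and (3, 4) as matches. Returning the score alongside each pair is a small step toward explainability: it shows how close a decision was to the threshold, although it does not yet explain why a pair was never considered as a candidate.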


Prerequisites

Foundations and experience in data cleaning and data matching


Teaching and Learning Formats

Project seminar with weekly meetings, presentations and discussions


Assessment

Presentation and written report


Please see website