Explainable Data Matching (Summer Semester 2022)
Lecturer: Prof. Dr. Felix Naumann (Information Systems)
Course Website: https://hpi.de/en/naumann/teaching/current-courses/ss-22/explainable-data-matching.html
General Information
- Weekly Hours: 4
- Credits: 6
- Graded: yes
- Enrolment Deadline: 01.04.2022 - 30.04.2022
- Teaching Form: Seminar
- Enrolment Type: Compulsory Elective Module
- Course Language: English
- Maximum number of participants: 6
Programs, Module Groups & Modules
- OSIS: Operating Systems & Information Systems Technology
- HPI-OSIS-K Concepts and Methods
- OSIS: Operating Systems & Information Systems Technology
- HPI-OSIS-S Specialization
- OSIS: Operating Systems & Information Systems Technology
- HPI-OSIS-T Techniques and Tools
- PREP: Data Preparation
- HPI-PREP-K Concepts and Methods
- PREP: Data Preparation
- HPI-PREP-T Techniques and Tools
- PREP: Data Preparation
- HPI-PREP-S Specialization
Description
Data matching is the process of detecting (and subsequently cleaning) multiple representations of the same real-world object within a given dataset. Typical approaches create a candidate set of record pairs, compute the similarity of each pair, and compare that similarity to a threshold to decide whether the pair is a match. Such data matching systems and their components can be quite complex, and their results are difficult to understand. Building upon the data matching benchmark platform Frost and its implementation Snowman, we plan to develop methods that better explain data matching results to developers and domain experts.
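The typical approach described above (candidate pairs, similarity, threshold) can be illustrated with a minimal sketch. It is not taken from Frost or Snowman; the function names, the first-letter blocking key, the use of Python's `difflib.SequenceMatcher` as similarity measure, the threshold of 0.8, and the sample records are all assumptions made for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def candidate_pairs(records, key):
    """Blocking: only pair up records that share the same blocking key."""
    blocks = {}
    for rec_id, rec in records.items():
        blocks.setdefault(key(rec), []).append(rec_id)
    for ids in blocks.values():
        yield from combinations(sorted(ids), 2)


def similarity(a, b):
    """String similarity in [0, 1], case-insensitive (illustrative choice)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match(records, threshold=0.8):
    """Return (id_a, id_b, score, is_match) for every candidate pair."""
    results = []
    # Hypothetical blocking key: first letter of the name.
    for id_a, id_b in candidate_pairs(records, key=lambda r: r["name"][0].lower()):
        score = similarity(records[id_a]["name"], records[id_b]["name"])
        results.append((id_a, id_b, round(score, 2), score >= threshold))
    return results


# Made-up sample data, for illustration only.
records = {
    1: {"name": "Jon Smith"},
    2: {"name": "John Smith"},
    3: {"name": "Jane Doe"},
    4: {"name": "Mary Jones"},
}

for id_a, id_b, score, is_match in match(records):
    print(id_a, id_b, score, "match" if is_match else "non-match")
```

Reporting the score and the threshold decision for each candidate pair, rather than only the final set of duplicates, is one simple starting point for explaining why a pair was or was not matched. Record 4 never appears in the output because blocking places it in a block of its own, which is itself a decision a user may want explained.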
Requirements
Foundational knowledge of, and experience in, data cleaning and data matching
Literature
Learning
Project seminar with weekly meetings, presentations and discussions
Examination
Presentation and written report
Dates
Please see the course website.