Data Quality for AI (Winter Semester 2021/2022)
Lecturers: Prof. Dr. Felix Naumann, Dr. Hazar Harmouch
- Weekly Hours: 4
- Credits: 6
- Enrolment Deadline: 01.10.2021 - 22.10.2021
- Teaching Form: Project / Seminar
- Enrolment Type: Compulsory Elective Module
- Course Language: English
- Maximum number of participants: 6
Programs & Modules
- DATA-Konzepte und Methoden
- DATA-Techniken und Werkzeuge
- PREP-Konzepte und Methoden
- PREP-Techniken und Werkzeuge
Many AI methods depend on large quantities of suitable training data. This creates challenges concerning not only the availability of data but also its quality. For example, incomplete, erroneous, inappropriate, or asymmetric training data leads to unreliable models and can ultimately lead to poor decisions, a problem often summarized as "garbage in, garbage out" (GIGO). The traditional definition of data or information quality includes dimensions such as validity, accuracy, completeness, consistency, and uniformity. However, this long-established definition does not yet consider modern AI systems and their requirements. Furthermore, there is little research on the explainability of machine learning models in terms of the quality of their training and test data. In this seminar, we will introduce you to the field of data quality and explore together the correlation between data quality and AI model performance.
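As a small illustration of one of the quality dimensions named above, completeness, the following sketch computes the fraction of non-missing values per column. It is not part of the course material: the toy records, the column names, and the helper functions `completeness` and `completeness_report` are assumptions made up for this example, and treating `None` as "missing" is only one possible convention.

```python
# Illustrative toy data (hypothetical): None marks a missing value.
records = [
    {"age": 34, "income": 52000, "label": 1},
    {"age": None, "income": 48000, "label": 0},
    {"age": 29, "income": None, "label": 1},
    {"age": 41, "income": None, "label": 0},
]


def completeness(rows, column):
    """Fraction of rows whose value in `column` is present (not None)."""
    if not rows:
        return 0.0
    present = sum(1 for row in rows if row.get(column) is not None)
    return present / len(rows)


def completeness_report(rows):
    """Completeness per column, over every column that appears in any row."""
    columns = sorted({key for row in rows for key in row})
    return {col: completeness(rows, col) for col in columns}


report = completeness_report(records)
for column, score in report.items():
    print(f"{column}: {score:.2f}")
```

A score well below 1.0 for a feature or for the label column is one simple signal that the training data may need cleaning or imputation before a model is fit on it.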
For this seminar, participants need to be able to program fluently in Python and know how to use Jupyter notebooks. The seminar also requires basic knowledge of machine learning algorithms.
- Project seminar with weekly meetings
- We plan to hold the course on-site. However, we will switch to hybrid or online mode if regulations change.
- Intermediate and final presentation
- Demonstration of and report on the implemented method and its experimental results