Data Quality for AI (Winter Semester 2021/2022)
Lecturer: Prof. Dr. Felix Naumann (Information Systems), Dr. Hazar Harmouch (Information Systems)
Course Website:
General Information
- Weekly Hours: 4
- Credits: 6
- Graded: yes
- Enrolment Deadline: 01.10.2021 - 22.10.2021
- Teaching Form: Project / Seminar
- Enrolment Type: Compulsory Elective Module
- Course Language: English
- Maximum number of participants: 6
Programs, Module Groups & Modules
- OSIS: Operating Systems & Information Systems Technology
- HPI-OSIS-K Concepts and Methods
- OSIS: Operating Systems & Information Systems Technology
- HPI-OSIS-S Specialization
- OSIS: Operating Systems & Information Systems Technology
- HPI-OSIS-T Techniques and Tools
- DATA: Data Analytics
- HPI-DATA-K Concepts and Methods
- DATA: Data Analytics
- HPI-DATA-T Techniques and Tools
- DATA: Data Analytics
- HPI-DATA-S Specialization
- PREP: Data Preparation
- HPI-PREP-K Concepts and Methods
- PREP: Data Preparation
- HPI-PREP-T Techniques and Tools
- PREP: Data Preparation
- HPI-PREP-S Specialization
Description
Many AI methods depend on large quantities of suitable training data. This creates challenges not only for the availability of data but also for its quality. For example, incomplete, erroneous, inappropriate, or asymmetric training data leads to unreliable models and can ultimately lead to poor decisions, an effect often summarized as "garbage in, garbage out" (GIGO). The traditional definition of data or information quality includes dimensions such as validity, accuracy, completeness, consistency, and uniformity. However, this long-established definition does not yet consider modern AI systems and their requirements. Furthermore, there is little research on the explainability of machine learning models in terms of the quality of their training and test data. In this seminar, we will introduce you to the field of data quality and explore together the correlation between data quality and AI model performance.
Requirements
For this seminar, participants need to be able to program fluently in Python and know how to use Jupyter notebooks. The seminar also requires basic knowledge of machine learning algorithms.
Learning
- Project seminar with weekly meetings
- We plan to hold the course on-site. However, we will switch to hybrid or online mode if the regulations change.
Examination
- Intermediate and final presentation
- Demonstration and report of method implementation and its experimental results