Hasso-Plattner-Institut: 25 Years of HPI

Data Integration (Summer Semester 2024)

Lecturers: Prof. Dr. Felix Naumann (Information Systems), Sebastian Schmidl (Information Systems)
Course website: https://hpi.de/naumann/teaching/current-courses/ss-2024/data-integration.html

General Information

  • Weekly hours per semester: 4
  • ECTS: 6
  • Graded: Yes
  • Enrollment period: 01.04.2024-30.04.2024
  • Teaching format: Lecture / Exercise
  • Enrollment type: Compulsory elective module
  • Language of instruction: English

Degree Programs, Module Groups & Modules

IT-Systems Engineering MA
  • IT-Systems Engineering
    • HPI-ITSE-A Analysis
  • IT-Systems Engineering
    • HPI-ITSE-E Design
  • OSIS: Operating Systems & Information Systems Technology
    • HPI-OSIS-K Concepts and Methods
  • OSIS: Operating Systems & Information Systems Technology
    • HPI-OSIS-T Techniques and Tools
Data Engineering MA
Software Systems Engineering MA


Data integration is the merging of heterogeneous information from various data sources into a homogeneous, clean dataset. Despite 40 years of research and development, collecting and integrating data from multiple sources remains an important and challenging task in any data-oriented or data science project. This lecture covers the basic technologies: distributed database architectures, techniques for virtual and materialized integration, data profiling, and data cleansing. It thus combines the previous foundational lectures on information integration and data profiling to lay a foundation for handling unknown data.
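
As a rough illustration of the merging described above (not course material), the following minimal Python sketch integrates two small sources with different schemas. All record contents, field names, and the helper functions `to_target_schema`, `clean`, and `integrate` are hypothetical, and using the e-mail address as the duplicate key is a simplifying assumption.

```python
# Hypothetical source A: uses "full_name", "mail", "city".
source_a = [
    {"full_name": "Ada Lovelace", "mail": "ADA@example.org", "city": None},
    {"full_name": "Alan Turing", "mail": "alan@example.org", "city": "Cambridge"},
]
# Hypothetical source B: same kind of entity, different field names.
source_b = [
    {"name": "ada lovelace ", "email": "ada@example.org", "town": "London"},
    {"name": "Grace Hopper", "email": "grace@example.org", "town": "New York"},
]

# Schema mappings: source field -> field in the shared target schema.
MAPPING_A = {"full_name": "name", "mail": "email", "city": "city"}
MAPPING_B = {"name": "name", "email": "email", "town": "city"}


def to_target_schema(record, mapping):
    """Rename source fields to the shared target schema."""
    return {target: record.get(source) for source, target in mapping.items()}


def clean(record):
    """Trim whitespace in string values and lowercase the e-mail address."""
    cleaned = {k: v.strip() if isinstance(v, str) else v for k, v in record.items()}
    if cleaned["email"]:
        cleaned["email"] = cleaned["email"].lower()
    return cleaned


def integrate(*sources_with_mappings):
    """Map, clean, and fuse records; the e-mail address is the duplicate key."""
    merged = {}
    for records, mapping in sources_with_mappings:
        for record in records:
            rec = clean(to_target_schema(record, mapping))
            existing = merged.setdefault(rec["email"], rec)
            # Fusion: fill missing values of the first-seen record.
            for field, value in rec.items():
                if existing[field] is None:
                    existing[field] = value
    return list(merged.values())


integrated = integrate((source_a, MAPPING_A), (source_b, MAPPING_B))
for row in integrated:
    print(row)
```

The sketch materializes the integrated result in memory. The two entries for Ada Lovelace are recognized as duplicates and fused, with the missing city taken from the second source. Real duplicate detection and conflict resolution are considerably more involved than this exact-key match.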


  • Prerequisite: database knowledge (e.g. DBS I)

Learning and Teaching Formats

Lecture and exercises


The course grade is based entirely on the written exam (approx. 3 h), held after the end of the teaching period. Requirements for exam admission are:

  • "Passing" all four exercises
  • At least one short presentation of an exercise solution